Hello, I'm new to HtmlUnit. I'm trying to capture the content of a div that is loaded later via JavaScript, but I'm running into exceptions.
My code:
WebClient client = new WebClient(BrowserVersion.CHROME);
HtmlPage pagina = client.getPage("https://www.rico.com.vc/renda-fixa/cdb");
client.getOptions().setThrowExceptionOnScriptError(false);
client.getOptions().setJavaScriptEnabled(true);
client.setAjaxController(new NicelyResynchronizingAjaxController());
client.waitForBackgroundJavaScript(60000);
System.out.println(pagina.asText());
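Some of the output (my JVM runs in a Portuguese locale, so the log levels INFO and SEVERE are printed as INFORMAÇÕES and GRAVE):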
INFORMAÇÕES:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Rico.com.vc</title>
<link href='https://fonts.googleapis.com/css?family=Montserrat:400,700|Open+Sans:400,700&subset=latin,latin-ext' rel='stylesheet' type='text/css'>
<style type="text/css">
body
{
margin:0;
padding:0;
font-family:'Trebuchet Ms', Arial, Helvetica;
font-size:12px;
}
.text-center { text-align: center; }
.uppercase { text-transform: uppercase; }
.font-montserrat { font-family: 'Montserrat'; }
.font-open { font-family: 'Open Sans'; }
.font-size-1 { font-size: 16px; }
.font-size-2 { font-size: 28px; }
.padding-bottom-1 { padding-bottom: 10px; }
.padding-bottom-2 { padding-bottom: 30px; }
.msg-error
{
margin-top: -145px;
padding-left: 235px;
font-size: 60px;
color: #FFF;
text-shadow:
-1px -1px 0 #F18719,
1px -1px 0 #F18719,
-1px 1px 0 #F18719,
1px 1px 0 #F18719;
}
.margin-top-1 { margin-top: 145px; }
.font-grey { color: grey; }
.font-bold { font-weight: bold; }
.img_logo { width: 50%; }
.img { width: 110%; padding: 20px 0 30px 0; }
.content { width: 500px; text-align: center; margin: 20px auto; }
.button-wrapper
{
background-color: #EF8A32;
padding: 15px 15px;
border: 1px solid #EF8A32;
border-radius: 4px;
}
.home-link
{
color: white;
text-decoration: none;
font-weight: bold;
}
</style>
</head>
<body onload="document_onload()">
<script type="text/javascript">
function document_onload()
{
lblErro.innerHTML = "";
}
</script>
<div class="content">
<a href="/"><img src="//www.rico.com.vc/rico-base/Rico_logo.jpg" class="img_logo" border="0" alt="Logo Rico" /></a>
<img src="//www.rico.com.vc/dashboard/img/404.png" class="img" alt="Computador com mensagem de erro" />
<div>
<p class="msg-error">404</p>
</div>
</div>
<div class="text-center">
<p class="uppercase font-montserrat font-size-2 font-grey font-bold padding-bottom-1 margin-top-1">A página solicitada não foi encontrada.</p>
<p class="font-open font-size-1 font-grey padding-bottom-2">Caso o problema persista, favor entrar em contato com<br /> nossa central de <a href="https://www.rico.com.vc/servicos/atendimento/contato" class="font-grey font-bold">atendimento.</a></p>
<button class="button-wrapper" type="button">
<a href="/" class="home-link uppercase" title="Voltar para a Home">Ir para a Página Inicial ></a>
</button>
</div>
</body>
</html>
jun 06, 2018 12:45:12 PM com.gargoylesoftware.htmlunit.javascript.DefaultJavaScriptErrorListener loadScriptError
GRAVE: Error loading JavaScript from [https://www.rico.com.vc:443/WebResource.axd?d=p-e1U0PJjdGCHIHBWiD1_mnNyd8XXQ5baJIt17nqS_Wf552pOyyqkjGu6pxXAZ0QL3vedCpP0awH9-IXEKTmPIHCFcY_2PgSBqh3-Kt13gLbD5Wx8QQ_xVePbKJbgc7Nt5QmnNlvk1_kvJEpqvYH5nDIR3o1&t=636177466400000000].
com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException: 404 Not Found for https://www.rico.com.vc:443/WebResource.axd?d=p-e1U0PJjdGCHIHBWiD1_mnNyd8XXQ5baJIt17nqS_Wf552pOyyqkjGu6pxXAZ0QL3vedCpP0awH9-IXEKTmPIHCFcY_2PgSBqh3-Kt13gLbD5Wx8QQ_xVePbKJbgc7Nt5QmnNlvk1_kvJEpqvYH5nDIR3o1&t=636177466400000000
at com.gargoylesoftware.htmlunit.WebClient.throwFailingHttpStatusCodeExceptionIfNecessary(WebClient.java:590)
at com.gargoylesoftware.htmlunit.html.HtmlPage.loadJavaScriptFromUrl(HtmlPage.java:1034)
at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:975)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:371)
at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute(HtmlScript.java:246)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:267)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:805)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:761)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1236)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1136)
at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:226)
at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:345)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3189)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2141)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:945)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:521)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:472)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:1004)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:253)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:195)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:267)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:158)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:529)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:398)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:315)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:463)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:448)
at br.com.controller.RoboRico.capturarConteudo(RoboRico.java:26)
at org.Robo.Main.main(Main.java:14)
Exception in thread "main" com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException: 404 Not Found for https://www.rico.com.vc:443/WebResource.axd?d=p-e1U0PJjdGCHIHBWiD1_mnNyd8XXQ5baJIt17nqS_Wf552pOyyqkjGu6pxXAZ0QL3vedCpP0awH9-IXEKTmPIHCFcY_2PgSBqh3-Kt13gLbD5Wx8QQ_xVePbKJbgc7Nt5QmnNlvk1_kvJEpqvYH5nDIR3o1&t=636177466400000000
at com.gargoylesoftware.htmlunit.WebClient.throwFailingHttpStatusCodeExceptionIfNecessary(WebClient.java:590)
at com.gargoylesoftware.htmlunit.html.HtmlPage.loadJavaScriptFromUrl(HtmlPage.java:1034)
at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:975)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:371)
at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute(HtmlScript.java:246)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:267)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:805)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:761)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1236)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1136)
at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:226)
at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:345)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3189)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2141)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:945)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:521)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:472)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:1004)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:253)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:195)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:267)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:158)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:529)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:398)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:315)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:463)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:448)
at br.com.controller.RoboRico.capturarConteudo(RoboRico.java:26)
at org.Robo.Main.main(Main.java:14)
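
Two things stand out in the trace above: the exception is thrown from inside getPage, before any of the option lines in my code have run, and it is a FailingHttpStatusCodeException (a 404 on an external script) rather than a script error. A minimal sketch of a variant that might avoid it is below. It sets the options before loading the page and also turns off exceptions for failing status codes. The class name RoboRicoSketch is hypothetical, the 10-second wait is an arbitrary value, and I have not verified that the div actually renders with this change.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

// Hypothetical class name, used only for this sketch.
public class RoboRicoSketch {
    public static void main(String[] args) throws IOException {
        // try-with-resources closes the client and its background threads.
        try (WebClient client = new WebClient(BrowserVersion.CHROME)) {
            // Configure the client BEFORE getPage, so the options apply
            // while the page and its scripts are being loaded.
            client.getOptions().setJavaScriptEnabled(true);
            client.getOptions().setThrowExceptionOnScriptError(false);
            // The trace shows a 404 on a script URL; this option controls
            // whether such responses raise FailingHttpStatusCodeException.
            client.getOptions().setThrowExceptionOnFailingStatusCode(false);
            client.setAjaxController(new NicelyResynchronizingAjaxController());

            HtmlPage pagina = client.getPage("https://www.rico.com.vc/renda-fixa/cdb");

            // Assumption: 10 seconds is enough for the background scripts.
            client.waitForBackgroundJavaScript(10000);
            System.out.println(pagina.asText());
        }
    }
}
```

With these options the 404 would still be logged, but it should no longer abort getPage. Whether the missing script is needed for the div to load is a separate question.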