How does HtmlUnit work?


Does anyone have an example of a login in which one web system communicates with another? I would send the username and password to another site, which would authenticate them and tell me whether they are correct, using HtmlUnit in Java.

What I intend to do: I have a web system that should log users in through another system belonging to the same client, who preferred that I not access that other system's database directly. So I have a login page where the user enters a login and password; through HtmlUnit I send this information as a request and get back a page with JavaScript in response.

    
asked by anonymous 17.06.2015 / 21:16

1 answer


In short, HtmlUnit has an API that lets Java applications perform the same actions a user would perform in a browser, such as opening a web page, clicking buttons and/or links, and filling out forms.

Roughly speaking, it is a browser without a graphical interface (the project describes itself as a "GUI-less browser for Java programs"). Features and other information can be found on the project page.

Example

Consider accessing the page http://meusiteficticio.com, which contains a form with this structure:

<form id='form-login' action='/login' method='post'>
   <input name='user' type='text' placeholder='Nome de usuário'/>
   <input name='pass' type='password' placeholder='Senha'/>
   <input type='submit' value='entrar'/>
</form>

In a browser, the user would type a username and password into the appropriate fields and then click the button to submit the form. We will do the same, but from within the application.

HtmlUnit implemented (v2.8) and later made public (v2.11) the querySelector and querySelectorAll methods, which work much like the JavaScript functions of the same name. With these methods, the code looks like this:

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlPasswordInput;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

// Get the login page.
HtmlPage paginaDeLogin = new WebClient(BrowserVersion.BEST_SUPPORTED)
                             .getPage("http://meusiteficticio.com");

// Get the form elements.
HtmlTextInput inputNomeDeUsuario = paginaDeLogin.querySelector("input[name='user']");
HtmlPasswordInput inputSenha = paginaDeLogin.querySelector("input[name='pass']");
HtmlSubmitInput botaoEnviar = paginaDeLogin.querySelector("#form-login > input[type='submit']");

// Set the 'value' attribute of the inputs.
inputNomeDeUsuario.setValueAttribute("joao");
inputSenha.setValueAttribute("joao1234");

// Simulate the click on the submit button and wait for the response
HtmlPage paginaAposOLogin = botaoEnviar.click();

// Print the page's HTML source
System.out.println(paginaAposOLogin.asXml());

If you are using an older version that does not support querySelector, you first have to get the form and then get the inputs with the getInputByName method:

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlPasswordInput;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

// Simulate a Chrome browser.
WebClient client = new WebClient(BrowserVersion.CHROME);

// Get the page.
HtmlPage paginaDeLogin = client.getPage("http://meusiteficticio.com");

// Get the login form by its "id" attribute in the HTML.
// The second argument is the case-sensitivity flag: with true the id
// must match exactly; with false, e.g. "FoRm-LoGiN" would also find the form.
HtmlForm formularioDeLogin = paginaDeLogin.getElementById("form-login", true);

// Get the form's inputs by their "name" attribute:
HtmlTextInput inputNomeDeUsuario = formularioDeLogin.getInputByName("user");
HtmlPasswordInput inputSenha = formularioDeLogin.getInputByName("pass");

// The submit "button" has no name, id, class, etc.,
// so one way to get it is by its value='entrar'.
HtmlSubmitInput botaoEnviar = formularioDeLogin.getInputByValue("entrar");

// Fill in the username and password fields
// (as if typing them in the browser)
inputNomeDeUsuario.setValueAttribute("joao");
inputSenha.setValueAttribute("joao1234");

// Simulate the click on the submit button and wait for the response
HtmlPage paginaAposOLogin = botaoEnviar.click();

// Print the page's HTML source
System.out.println(paginaAposOLogin.getWebResponse().getContentAsString());

Be kind to yourself and handle the exceptions. Trying to set (or otherwise manipulate) a value on an input that does not exist will throw a NullPointerException.
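One way to make that failure easier to diagnose is to check each lookup result before using it. The following is only a minimal sketch using plain JDK classes: the class name ElementLookup and the method require are hypothetical, and the null value in main merely stands in for a querySelector call that matched nothing.

```java
class ElementLookup {
    /**
     * Returns the element unchanged, or fails right away with a message
     * naming the selector, instead of a bare NullPointerException later on.
     */
    static <T> T require(T element, String selector) {
        if (element == null) {
            throw new IllegalStateException("No element matches selector: " + selector);
        }
        return element;
    }

    public static void main(String[] args) {
        // Stand-in for a page lookup that found nothing.
        Object missing = null;
        try {
            require(missing, "input[name='user']");
        } catch (IllegalStateException e) {
            // The message tells you which selector failed.
            System.out.println(e.getMessage());
        }
    }
}
```

With HtmlUnit, the idea would be to wrap each lookup, for example passing the result of paginaDeLogin.querySelector("input[name='user']") together with the same selector string to require, so that a changed login page is reported clearly.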

Keeping cookies

If you need to keep cookies for use in future requests, you should configure the CookieManager of your "browser" (that is, the WebClient).

WebClient client = new WebClient(BrowserVersion.FIREFOX_24);
CookieManager cookieManager = client.getCookieManager();
cookieManager.setCookiesEnabled(true);
client.setCookieManager(cookieManager);

HtmlPage fb = client.getPage("https://facebook.com");

Disabling warnings and alerts

HtmlUnit logs a warning for everything that makes the HTML document invalid, e.g. obsolete attributes, JavaScript errors, and CSS errors.

You can turn off these alerts by setting the HtmlUnit logger level to OFF:

Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
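A slightly more defensive version of that line is sketched below. It assumes HtmlUnit's messages end up in java.util.logging, which is the case when Apache Commons Logging finds no other backend (such as Log4j) on the classpath. The class name HtmlUnitLogging and the method silence are hypothetical. The point of the static field is that java.util.logging only holds weak references to loggers, so a level set on a logger nobody references can be lost after garbage collection.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class HtmlUnitLogging {
    // Keep a strong reference so the configured level survives garbage collection.
    private static final Logger HTMLUNIT_LOGGER =
            Logger.getLogger("com.gargoylesoftware.htmlunit");

    /** Turns off the HtmlUnit logger and, by inheritance, its child loggers. */
    static void silence() {
        HTMLUNIT_LOGGER.setLevel(Level.OFF);
    }

    public static void main(String[] args) {
        silence();
        // A child logger with no level of its own inherits OFF from the parent.
        Logger child = Logger.getLogger("com.gargoylesoftware.htmlunit.javascript");
        System.out.println(child.isLoggable(Level.SEVERE));
    }
}
```

Call silence() once, before creating the WebClient, so that the warnings produced while loading the first page are already suppressed.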
    
18.06.2015 / 11:20