I'm writing a program to run CPF and CNPJ queries on the federal revenue site. It is not an attempt to circumvent the system: I load the captcha into a PictureBox so the user only has to type its characters, while the CNPJ or CPF is filled in automatically.
The part that performs the CPF query is simpler: I just filled in the HTTP header fields manually and set the session cookie obtained in the captcha request. The problem is with the CNPJ query.
For the CNPJ query, the browser submits a request to the federal revenue site with the following headers:
POST http://www.receita.fazenda.gov.br/pessoajuridica/cnpj/cnpjreva/valida.asp HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Referer: http://www.receita.fazenda.gov.br/pessoajuridica/cnpj/cnpjreva/cnpjreva_solicitacao2.asp
Accept-Language: pt-BR,pt;q=0.5
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Content-Length: 111
DNT: 1
Host: www.receita.fazenda.gov.br
Pragma: no-cache
Cookie: flag=1; ASPSESSIONIDAQDSBCCD=ENPHBBJDCNDBAFPALIEEGOPP; ASPSESSIONIDASDSBCCC=EJGNAFJDDFCIOBAHNIFCPEHC
After this, the browser makes three more requests, and the information for the queried CNPJ comes back after the third.
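The browser's Cookie header above carries two ASPSESSIONID entries plus flag=1, which suggests that more than one session cookie is set along the chain of requests. A minimal sketch of one way to keep up with that, assuming a single CookieContainer is shared by the captcha request and every later request (the class and method names here are hypothetical, not part of my program):

```csharp
using System;
using System.Net;

static class ReceitaSession
{
    // Hypothetical helper: one container for the whole captcha + query session,
    // so any Set-Cookie received on one request is sent on the next ones.
    public static readonly CookieContainer Cookies = CreateContainer();

    private static CookieContainer CreateContainer()
    {
        var container = new CookieContainer();
        // flag=1 appears in the browser's Cookie header; here it is added by hand.
        container.Add(new Cookie("flag", "1", "/", "www.receita.fazenda.gov.br"));
        return container;
    }

    // Creates a request that reads and writes cookies through the shared container.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = Cookies;
        return request;
    }
}
```

With a container assigned, the Cookie header is generated from it, so the hand-built Cookie header should no longer be needed. This only helps if the captcha image is downloaded through the same container.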
What I cannot do is reproduce this query with an HttpWebRequest, even though I have defined all the headers.
private void consultarPessoaJuridica()
{
string postContent = "origem=comprovante"
+ "&cnpj=60872504000123"
+ "&txtTexto_captcha_serpro_gov_br=" + textBoxCaptcha1.Text
+ "&submit1=Consultar"
+ "&search_type=cnpj";
byte[] postBytesArray = Encoding.UTF8.GetBytes(postContent);
WebRequest webRequest = WebRequest.Create("http://www.receita.fazenda.gov.br/pessoajuridica/cnpj/cnpjreva/valida.asp");
((HttpWebRequest)webRequest).MaximumAutomaticRedirections = 4;
((HttpWebRequest)webRequest).AllowAutoRedirect = true;
((HttpWebRequest)webRequest).Timeout = 10000;
webRequest.Headers.Add("Pragma", "no-cache");
((HttpWebRequest)webRequest).Accept = "text/html, application/xhtml+xml, */*";
webRequest.Headers.Add("Accept-Encoding", "gzip, deflate");
webRequest.Headers.Add("Accept-Language", "pt-BR,pt;q=0.5");
((HttpWebRequest)webRequest).UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko";
webRequest.Headers.Add("Cookie", this.cookie + "; flag=1");
webRequest.Headers.Add("DNT", "1");
((HttpWebRequest)webRequest).ContentLength = postBytesArray.Length;
((HttpWebRequest)webRequest).ContentType = "application/x-www-form-urlencoded";
((HttpWebRequest)webRequest).Referer = "http://www.receita.fazenda.gov.br/pessoajuridica/cnpj/cnpjreva/cnpjreva_solicitacao2.asp";
((HttpWebRequest)webRequest).KeepAlive = true;
((HttpWebRequest)webRequest).Host = "www.receita.fazenda.gov.br";
webRequest.Method = "POST";
// Get the request stream.
using (Stream dataStream = webRequest.GetRequestStream())
{
// Write data to the request stream.
dataStream.Write(postBytesArray, 0, postBytesArray.Length);
}
using (WebResponse webResponse = webRequest.GetResponse())
using (Stream dataStream = webResponse.GetResponseStream())
using (StreamReader reader = new StreamReader(dataStream))
{
string responseFromServer = reader.ReadToEnd();
}
}
This request appears in Fiddler as follows:
POST http://www.receita.fazenda.gov.br/pessoajuridica/cnpj/cnpjreva/valida.asp HTTP/1.1
Pragma: no-cache
Accept: text/html, application/xhtml+xml, */*
Accept-Encoding: gzip, deflate
Accept-Language: pt-BR,pt;q=0.5
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko
Cookie: ASPSESSIONIDCQCSADDD=EGGPDBLDCBJEPHJJONMEHHCP; flag=1
DNT: 1
Content-Type: application/x-www-form-urlencoded
Referer: http://www.receita.fazenda.gov.br/pessoajuridica/cnpj/cnpjreva/cnpjreva_solicitacao2.asp
Host: www.receita.fazenda.gov.br
Content-Length: 111
Expect: 100-continue
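Comparing the two captures, the request sent by the program carries an Expect: 100-continue header that the browser's request does not. Whether this server cares is an assumption on my part, but the header can be switched off per request. The sketch below also enables AutomaticDecompression, since a request that advertises gzip and deflate may get a compressed body back that StreamReader cannot read as text. The class and method names are hypothetical:

```csharp
using System.Net;

static class CnpjRequestSetup
{
    // Builds a form POST that omits "Expect: 100-continue" and lets the
    // framework negotiate and decode gzip/deflate responses.
    public static HttpWebRequest CreatePost(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        // Stops HttpWebRequest from adding the Expect header seen in Fiddler.
        request.ServicePoint.Expect100Continue = false;
        // Sends Accept-Encoding and transparently decompresses the body.
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }
}
```

If this is used, the manual Accept-Encoding header in the method above becomes redundant.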
I believe the query fails because, for some reason, the second request returns status code 200, so no further automatic redirection takes place after that.
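To check where the chain actually stops, one option is to turn automatic redirection off and follow each hop by hand, logging the status code and Location header along the way. This is only a diagnostic sketch under assumptions: the first request must already have AllowAutoRedirect = false, the shared CookieContainer assigned, and its body written before it is passed in; RedirectTracer and its members are hypothetical names.

```csharp
using System;
using System.IO;
using System.Net;

static class RedirectTracer
{
    // Resolves a possibly relative Location value against the current URL.
    public static Uri ResolveLocation(Uri current, string location)
    {
        return new Uri(current, location);
    }

    // Follows 3xx responses one at a time and returns the first non-redirect body.
    public static string FollowManually(HttpWebRequest first, CookieContainer cookies, int maxHops)
    {
        HttpWebRequest request = first;
        for (int hop = 0; hop <= maxHops; hop++)
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                int status = (int)response.StatusCode;
                string location = response.Headers["Location"];
                Console.WriteLine("{0} {1} -> {2}", status, response.ResponseUri, location);

                // Anything that is not a redirect with a target ends the chain.
                if (status < 300 || status >= 400 || string.IsNullOrEmpty(location))
                {
                    using (var reader = new StreamReader(response.GetResponseStream()))
                    {
                        return reader.ReadToEnd();
                    }
                }

                // The next hop is a plain GET that reuses the session cookies.
                request = (HttpWebRequest)WebRequest.Create(
                    ResolveLocation(response.ResponseUri, location));
                request.AllowAutoRedirect = false;
                request.CookieContainer = cookies;
            }
        }
        throw new InvalidOperationException("Too many redirects.");
    }
}
```

If the log shows a 200 with no Location header on the second hop, the body of that response is the place to look: the next step may be triggered by the page itself rather than by an HTTP redirect, which HttpWebRequest would not follow.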