I need to turn a captcha into text, and be specific, to download NFE from the #
I'm sure a lot of good answers will come up, but I'll leave my two cents here on the subject.
In my searches I've seen a lot about Google's Tesseract, which is one of the most efficient OCR engines available. But I have not developed anything with it; it was just research.
I did some tests, but without customizing it for the specific purpose, which is solving the Receita captcha, you will get many errors for very few hits.
The latest version of Tesseract is 3.02.02, so far.
Up to version 2.32, if I'm not mistaken, it was possible to write a C wrapper around the Tesseract library, which is written in C++, and so use it more easily from other languages. Today I think it's easier to look for an existing wrapper for Java, which is your case, along the lines of TesseractEngineWrapper for .NET.
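If you don't find a wrapper you like, one alternative is to call the `tesseract` command-line tool from Java. Below is a minimal sketch of that idea, not something I have run against the real captcha. It assumes `tesseract` is installed and on the PATH, that the `eng` language data is available, and that the captcha uses only letters and digits (the `cleanResult` filter is my assumption). The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class CaptchaOcr {

    // Builds the argument list for the tesseract CLI:
    //   tesseract <image> <outputBase> -l <lang>
    // tesseract writes the recognized text to <outputBase>.txt
    static List<String> buildCommand(Path image, Path outputBase, String lang) {
        List<String> cmd = new ArrayList<>();
        cmd.add("tesseract");
        cmd.add(image.toString());
        cmd.add(outputBase.toString());
        cmd.add("-l");
        cmd.add(lang);
        return cmd;
    }

    // Assumption: the captcha alphabet is ASCII letters and digits only,
    // so everything else (spaces, newlines, OCR noise) is dropped.
    static String cleanResult(String raw) {
        return raw.replaceAll("[^A-Za-z0-9]", "");
    }

    // Runs tesseract on the image and returns the cleaned text.
    static String recognize(Path image) throws IOException, InterruptedException {
        Path outputBase = Files.createTempFile("captcha", "");
        Process process = new ProcessBuilder(buildCommand(image, outputBase, "eng"))
                .inheritIO()
                .start();
        if (process.waitFor() != 0) {
            throw new IOException("tesseract exited with an error");
        }
        Path textFile = Paths.get(outputBase.toString() + ".txt");
        String raw = new String(Files.readAllBytes(textFile), StandardCharsets.UTF_8);
        return cleanResult(raw);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java CaptchaOcr captcha.png
        System.out.println(recognize(Paths.get(args[0])));
    }
}
```

Without preprocessing the image or training for the specific font, expect the poor hit rate mentioned above.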
I also saw online services that offer to solve the captcha for you, such as captchabot.com and deathbycaptcha.com, but I have not tested them either. There is also some discussion about whether this is legal or not.
There is a guy who has implemented something with these services: he sends the captcha to the API of one of these sites, the site solves it and returns the text, and then he accesses the page and works with the HTML; see here .
His blog is: link . Anyway, this is just so you know what has already been done.
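Just to make that flow concrete, here is a rough sketch of the "send the image, get the text back" step in Java 11+. I have not used any of these services, so nothing here reflects their real APIs: the endpoint, the Base64 plain-text body and the plain-text response are all hypothetical, and you would need to adapt them to whatever the chosen service documents.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

class CaptchaServiceClient {

    // Builds a POST request carrying the captcha image as Base64 text.
    // Hypothetical wire format: a real service will define its own.
    static HttpRequest buildRequest(URI endpoint, byte[] imageBytes) {
        String body = Base64.getEncoder().encodeToString(imageBytes);
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Sends the image and returns the answer, assuming (hypothetically)
    // that the service replies with the solved text as a plain body.
    static String solve(HttpClient client, URI endpoint, byte[] imageBytes)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(buildRequest(endpoint, imageBytes),
                            HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("solver returned HTTP " + response.statusCode());
        }
        return response.body().trim();
    }
}
```

The returned text would then be submitted together with the form on the page that showed the captcha, in the same HTTP session.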
But I'd encourage you to download the NFe, the XML itself, directly from the IRS WebService; you just need to have the certificate.