In my program I need to open a PDF file and extract the text it contains. However, the extracted text comes out garbled. For example, I get:
Thanks to my family for not being? measure effort
when the correct text would be:
Thanks to my family for not measuring effort
This only happens when the PDF was generated by LaTeX. When it was generated by Word, the text is extracted correctly. The code I'm using to read the PDF is:
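// iText 5.x imports (assumed from the static getTextFromPage call)
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

PdfReader reader = new PdfReader(diretorio); // diretorio is the path to the PDF
int n = reader.getNumberOfPages();           // n is the number of pages
String conteudo = "";
for (int i = 1; i <= n; i++) {               // iText pages are numbered from 1
    conteudo += PdfTextExtractor.getTextFromPage(reader, i);
}
reader.close();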
I know it has to do with encoding, but I do not know how to fix it.
Keep in mind that I do not generate the PDFs myself.
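To check whether this really is an encoding issue, one option is to print the Unicode code points of the extracted string and see which characters the extractor actually returns. The following is only a minimal sketch, assuming Java 8 or later. The class name CodePointDump, the dump helper, and the sample string are hypothetical and not part of iText; in the real program the input would be the string returned by getTextFromPage.

```java
public class CodePointDump {
    // Formats every code point of the input as "U+XXXX", separated by spaces.
    static String dump(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample; replace with the text extracted from the PDF.
        String sample = "n\u00E3o";
        System.out.println(dump(sample)); // prints "U+006E U+00E3 U+006F"
    }
}
```

If the dump of the LaTeX-generated text shows a separate accent character next to a plain letter where the Word-generated text shows a single accented letter, that would at least narrow down where the two kinds of PDF differ.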