I've written a scraper in Python that takes the URL of any PDF, reads it, and returns the text. With some PDFs, however, the output comes back with characters like this:
".\nO \xc3\xb3rg\xc3\xa3o also \xc3\xa9m discloses result \n\nGH\xc3\x80QLWLYR\x03GRV\x03FDQGLGDWRV\x03TXH\x03VH\x03GHFODUDUDP\x03FRP\x03GH\xc3\x80FLrQFLD\x03H\x03GRV\x03SHGLGRV\x03\n special assistance granted. \nThe contest aims to provide \nefetivo of 150 places for the \ninitial class (Class A) of the position of delegate of Civil Poll, whose vacancies will be \xc3\xa3o \n\nproved according to order of clasVL\xc3\x80FDomR\x03H\x03D\x03QHFHVVLGDGH\x03GR\x03VHUYLoR\x11\nA "
From what I could see, this happens when the document contains accented characters, columns, or even dashes.
I also noticed that if the PDF contains an image, strange characters are returned as well. Does anyone have a solution or an idea that could help?
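A minimal sketch of one way to read the output above, in case it helps narrow things down. It assumes the scraper returns raw `bytes` and that the sample is their repr, so sequences such as `\xc3\xb3` are escaped UTF-8 bytes. It also assumes that the uppercase runs (for example `GRV\x03FDQGLGDWRV`) are ordinary text with every character code shifted down by 29. That offset is only inferred from this one sample, and `unshift` is a hypothetical helper, not part of any PDF library.

```python
# Assumption: the scraper returns raw bytes and the sample is their repr.
raw = b"O \xc3\xb3rg\xc3\xa3o"

# Decoding as UTF-8 turns the escaped byte pairs back into accented letters.
text = raw.decode("utf-8")
print(text)  # O órgão


def unshift(run: str, offset: int = 29) -> str:
    """Hypothetical helper: shift every code point up by `offset`.

    The default of 29 is a guess read off this sample only; other fonts
    or other PDFs may need a different mapping, or none at all.
    """
    return "".join(chr(ord(ch) + offset) for ch in run)


# One of the uppercase runs from the sample output.
print(unshift("GRV\x03FDQGLGDWRV"))  # dos candidatos
```

If the second call produces readable Portuguese for the other runs too, the text is probably being extracted as raw glyph codes for one particular font rather than as real characters, which would point at the PDF's fonts rather than at the accents themselves.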