Special characters come back garbled in an HTTP GET response

I'm having a problem when making HTTP GET requests.

When the page text contains a special character, that character comes back distorted in the response.

Example:

os participantes dever&atilde;o:

when the original text was "os participantes deverão:".

The code I'm using to make the GET request is as follows:

    try {
        URL url = new URL("***Url***");
        BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()));
        String strTemp = "";
        while (null != (strTemp = br.readLine())) {
            System.out.println(strTemp);
        }
    } catch (Exception ex) {
        ex.printStackTrace();
    }

Any idea what might be causing this problem?

    
asked by anonymous 28.12.2015 / 19:12

1 answer

Constructs such as &amp; and &atilde; are called HTML character entities. They are not errors; they most likely appear exactly like that in the source of the original page. Entities are used to represent characters that are reserved in HTML, as well as other characters such as accented letters.

See this answer on Stack Overflow in English for some ways to replace them with the corresponding characters. The most popular approach seems to be StringEscapeUtils.unescapeHtml4() from the Apache Commons library.
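To illustrate what unescaping does, here is a minimal sketch that uses only the JDK (Java 9 or later). It is not how Apache Commons implements it: the class name EntityDecoder and its tiny table of named entities are assumptions made up for this example, whereas a real library covers the full HTML entity list. Unknown or malformed entities are left untouched.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of any library.
class EntityDecoder {

    // Deliberately small sample of named entities (the full HTML list is far longer).
    private static final Map<String, String> NAMED = Map.of(
            "amp", "&", "lt", "<", "gt", ">", "quot", "\"",
            "atilde", "\u00e3", "ccedil", "\u00e7", "eacute", "\u00e9");

    // Matches &name; as well as numeric forms such as &#227; and &#xE3;
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    static String decode(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#x") || body.startsWith("#X")) {
                replacement = fromCodePoint(body.substring(2), 16, m.group());
            } else if (body.startsWith("#")) {
                replacement = fromCodePoint(body.substring(1), 10, m.group());
            } else {
                // Unknown names are kept exactly as they appeared in the input.
                replacement = NAMED.getOrDefault(body, m.group());
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Converts the digits of a numeric entity to text, or returns the fallback if invalid.
    private static String fromCodePoint(String digits, int radix, String fallback) {
        try {
            int cp = Integer.parseInt(digits, radix);
            return Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp))
                    : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("os participantes dever&atilde;o:"));
    }
}
```

In the loop from the question, this would mean printing EntityDecoder.decode(strTemp) instead of strTemp. With Apache Commons on the classpath, the same line would call StringEscapeUtils.unescapeHtml4(strTemp) instead, which handles the complete set of entities.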

    
28.12.2015 / 19:59