Convert ISO-8859-1 string to UTF-8 in Java

1

My goal is to create a converter from ISO-8859-1 to UTF-8.

I already have the following code:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;

public class Converter {

    public static void main(String... args) throws IOException {

        BufferedReader in = null;
        try {
            File fileDir = new File("Mensagens.java");

            // Decode the file's bytes as ISO-8859-1
            in = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileDir), "ISO-8859-1"));

            String strISO;
            String strUTF8 = null;

            while ((strISO = in.readLine()) != null) {
                byte[] isoBytes = strISO.getBytes("ISO-8859-1");
                String value = new String(isoBytes, "UTF-8");
                if (strUTF8 == null) {
                    strUTF8 = value;
                } else {
                    strUTF8 += value;
                }
                System.out.println("ISO : " + strISO);
                System.out.println("UTF : " + value);
            }
        } catch (UnsupportedEncodingException e) {
            System.out.println(e.getMessage());
        } catch (IOException e) {
            System.out.println(e.getMessage());
        } catch (Exception e) {
            System.out.println(e.getMessage());
        } finally {
            // Guard against a NullPointerException if the file could not be opened
            if (in != null) {
                in.close();
            }
        }
        //System.out.println(strUTF8);
    }
}

But the UTF-8 output comes out wrong.

My question: what do I need to change in

byte[] isoBytes = strISO.getBytes("ISO-8859-1");
String value = new String(isoBytes, "UTF-8");
if (strUTF8 == null) {
    strUTF8 = value;
} else {
    strUTF8 += value;
}
System.out.println("ISO : " + strISO);
System.out.println("UTF : " + value);

so that the ISO and UTF outputs are identical?

Current output:

ISO : "Já existe lançamento com a mesma Nota Fiscal e Fornecedor.";
UTF : "J� existe lan�amento com a mesma Nota Fiscal e Fornecedor.";

Desired output:

ISO : "Já existe lançamento com a mesma Nota Fiscal e Fornecedor.";
UTF : "Já existe lançamento com a mesma Nota Fiscal e Fornecedor.";

Tests:

PrintStream outISO = new PrintStream(System.out, true, "ISO-8859-1");
PrintStream outUTF8 = new PrintStream(System.out, true, "UTF-8");
outISO.println("ISO : " + strISO);
outUTF8.println("UTF : " + value);

Output:

ISO : "J� existe lan�amento com a mesma Nota Fiscal e Fornecedor.";
UTF : "J� existe lan�amento com a mesma Nota Fiscal e Fornecedor.";

PrintStream outISO = new PrintStream(System.out, true, "UTF-8");
PrintStream outUTF8 = new PrintStream(System.out, true, "UTF-8");
outISO.println("ISO : " + strISO);
outUTF8.println("UTF : " + value);

Output:

ISO : "Já existe lançamento com a mesma Nota Fiscal e Fornecedor.";
UTF : "J� existe lan�amento com a mesma Nota Fiscal e Fornecedor.";

PrintStream outISO = new PrintStream(System.out, true, "UTF-8");
PrintStream outUTF8 = new PrintStream(System.out, true, "ISO-8859-1");
outISO.println("ISO : " + strISO);
outUTF8.println("UTF : " + value);

Output:

ISO : "Já existe lançamento com a mesma Nota Fiscal e Fornecedor.";
UTF : "J? existe lan?amento com a mesma Nota Fiscal e Fornecedor.";

PrintStream outISO = new PrintStream(System.out, true, "ISO-8859-1");
PrintStream outUTF8 = new PrintStream(System.out, true, "ISO-8859-1");
outISO.println("ISO : " + strISO);
outUTF8.println("UTF : " + value);

Output:

ISO : "J� existe lan�amento com a mesma Nota Fiscal e Fornecedor.";
UTF : "J? existe lan?amento com a mesma Nota Fiscal e Fornecedor.";
    
asked by anonymous 30.11.2016 / 13:23

1 answer

3

The problem is that System.out.println writes in only one encoding, so to print in different encodings you could use PrintStream:

PrintStream outISO = new PrintStream(System.out, true, "ISO-8859-1");
PrintStream outUTF8 = new PrintStream(System.out, true, "UTF-8");

outISO.println("ISO : " + strISO);
outUTF8.println("UTF : " + value);

Or:

System.setOut(new PrintStream(System.out, true, "ISO-8859-1"));
System.out.println("ISO : " + strISO);

System.setOut(new PrintStream(System.out, true, "UTF-8"));
System.out.println("UTF : " + value);
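To see what a PrintStream with an explicit encoding actually emits, independent of how the console renders it, the bytes can be captured in memory. This is only a sketch: the class name PrintStreamEncodingDemo and the helpers bytesWritten and hex are made up for illustration, and the sample text is written with a Unicode escape so the source file's own encoding does not interfere.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class, not part of the question's code.
class PrintStreamEncodingDemo {

    // Prints `text` through a PrintStream configured with `charsetName`
    // and returns the raw bytes that reached the underlying stream.
    static byte[] bytesWritten(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(sink, true, charsetName)) {
            out.print(text); // print, not println, so no line separator is added
        } catch (UnsupportedEncodingException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String text = "J\u00e1"; // "Já"
        System.out.println("ISO-8859-1: " + hex(bytesWritten(text, "ISO-8859-1"))); // 4A E1
        System.out.println("UTF-8     : " + hex(bytesWritten(text, "UTF-8")));      // 4A C3 A1
    }
}
```

The same String produces different bytes depending on the stream's encoding. Whether those bytes show up as "á", "�" or "?" then depends on the encoding the console itself expects.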

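Internally, Java strings are UTF-16, not UTF-8, so once the reader has decoded the file as ISO-8859-1, strISO already holds the correct text, independent of any encoding. Taking its ISO-8859-1 bytes and decoding them as UTF-8 is what produces the � characters, because a lone byte such as 0xE1 (á) is not valid UTF-8. The conversion should go the other way: encode the string to UTF-8 bytes, and decode those same bytes as UTF-8 if you need a string back:

byte[] utf8Bytes = strISO.getBytes("UTF-8");
String value = new String(utf8Bytes, "UTF-8");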
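For the stated goal of converting a whole file, one possible approach is to let a reader decode the bytes as ISO-8859-1 and a writer encode the same characters as UTF-8. The following is a minimal sketch under assumptions: the class name Latin1ToUtf8, the convert method and the temporary file names are hypothetical, and it uses java.nio.file.Files (Java 7+) rather than the streams from the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical converter class, not part of the question's code.
class Latin1ToUtf8 {

    // Decodes `source` as ISO-8859-1 and writes the same characters to `target` as UTF-8.
    static void convert(Path source, Path target) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.ISO_8859_1);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int read;
            // Copy characters in chunks so the original line endings are preserved
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the real input and output (assumption for the demo)
        Path source = Files.createTempFile("mensagens-iso", ".txt");
        Path target = Files.createTempFile("mensagens-utf8", ".txt");
        try {
            // "Já existe lançamento", escaped so the source file's encoding does not matter
            String sample = "J\u00e1 existe lan\u00e7amento";
            Files.write(source, sample.getBytes(StandardCharsets.ISO_8859_1));

            convert(source, target);

            String roundTrip = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            System.out.println("Same text after conversion: " + sample.equals(roundTrip));
            System.out.println("ISO-8859-1 size: " + Files.size(source) + " bytes");
            System.out.println("UTF-8 size     : " + Files.size(target) + " bytes");
        } finally {
            Files.deleteIfExists(source);
            Files.deleteIfExists(target);
        }
    }
}
```

Copying characters in chunks instead of using readLine keeps the line breaks, which the loop in the question drops when it concatenates lines. The UTF-8 file comes out slightly larger because each accented character takes two bytes instead of one.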
    
30.11.2016 / 13:36