Problem with encoding in Excel file reader for Java

I have a fairly common problem. I have a Java program that imports an Excel spreadsheet, and one of the columns contains words with accents, cedillas, etc.

When I read the value, these special characters always come out as a black diamond. I have already tried some solutions with Normalizer, getBytes() with every possible encoding, and something like this:

WorkbookSettings ws = new WorkbookSettings();  
ws.setEncoding("Cp1252");

But nothing worked :(
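For readers unfamiliar with the symptom: the black diamond is usually the Unicode replacement character (U+FFFD), which a decoder emits when bytes are not valid in the charset used to read them. The sketch below is only an illustration of that mechanism using the standard library, not a diagnosis of this specific program. The class name `ReplacementCharDemo` and the helper `roundTrip` are hypothetical, and the choice of windows-1252 versus UTF-8 is an assumption made for the demonstration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Encodes text with one charset, then decodes the resulting bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // Assumption for the demo: the text was stored as windows-1252 (Cp1252).
        Charset cp1252 = Charset.forName("windows-1252");
        String original = "lacta\u00e7\u00e3o"; // "lactação", written with escapes

        // Decoding Cp1252 bytes as UTF-8 hits invalid byte sequences.
        String wrong = roundTrip(original, cp1252, StandardCharsets.UTF_8);
        // Decoding with the same charset that produced the bytes is lossless here.
        String right = roundTrip(original, cp1252, cp1252);

        System.out.println("contains U+FFFD: " + (wrong.indexOf('\uFFFD') >= 0));
        System.out.println("round trip intact: " + right.equals(original));
    }
}
```

Once a character has been turned into U+FFFD, the original byte is gone, which would explain why calling getBytes() on the already-decoded String with other encodings cannot bring the accents back.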

The main code (to understand the problem) is:

Workbook workbook = Workbook.getWorkbook(new File(diretorio, v_arquivo));
Sheet sheet = workbook.getSheet(0);
Cell[] celula;

for (int i = 1; i < sheet.getRows(); i++) {
    celula = sheet.getRow(i);
    if (celula.length > 0) {
        evento = celula[7].getContents().trim();
    }
}

And my evento String comes out garbled: for example, in "lactação" the "ç" and "ã" are shown as black diamonds.

Thank you for your attention.

P.S.: I'm new to the forum and still learning the formatting, so sorry for any mistakes.

    
asked by anonymous 20.03.2015 / 17:46

1 answer

Well, no one has answered, but in case someone runs into this problem in the future, I'll record what I found here (I managed to work around the problem, in a way). If you need to use this variable in a SQL query (which was my case, because the garbled value did not match anything in SELECTs against the database), this is the function I used:

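// requires: import java.text.Normalizer;
public static String formatString(String s) {
    // Decompose accented characters (NFD), then replace every non-ASCII character with "%".
    String temp = Normalizer.normalize(s, Normalizer.Form.NFD);
    return temp.replaceAll("[^\\p{ASCII}]", "%");
}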

What it does: only the special characters are replaced with "%", so when the value is used in a SQL query that compares with LIKE (where "%" is a wildcard), the word will still be found in the table. But as I said, this is only a workaround for the problem, not a solution :x
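As a rough illustration of what this workaround produces, the sketch below runs the same logic on two inputs: a string that already contains replacement characters (U+FFFD), and the correctly decoded word. The class name `LikePatternDemo` and the sample strings are hypothetical; the regex is written with the doubled backslash that a Java string literal requires.

```java
import java.text.Normalizer;

public class LikePatternDemo {
    // Same logic as formatString in the answer:
    // decompose to NFD, then replace every non-ASCII character with '%'.
    static String formatString(String s) {
        String temp = Normalizer.normalize(s, Normalizer.Form.NFD);
        return temp.replaceAll("[^\\p{ASCII}]", "%");
    }

    public static void main(String[] args) {
        // Garbled input: the two accented letters were already lost to U+FFFD.
        String garbled = "lacta\uFFFD\uFFFDo";
        // Correctly decoded input: "lactação", written with escapes.
        String clean = "lacta\u00e7\u00e3o";

        System.out.println(formatString(garbled)); // lacta%%o
        System.out.println(formatString(clean));   // lactac%a%o
    }
}
```

For the garbled value the result is a usable LIKE pattern such as `lacta%%o`. For a correctly decoded value, NFD first splits "ç" into "c" plus a combining mark, so the pattern becomes `lactac%a%o`, which would not match "lactação" under an accent-sensitive comparison. In other words, the function fits the broken input it was written for and should not be applied once the encoding itself is fixed.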

    
27.03.2015 / 15:54