How to remove accents and other diacritical marks from a String in Java?

52

How can I remove accents and other diacritical marks from a String in Java? Example:

String s = "maçã";
String semAcento = ???; // result: "maca"
    
asked by rodrigorgs 11.12.2013 at 17:36

2 answers

69

I usually use regex along with the Normalizer class. So:

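// Requires: import java.text.Normalizer;
public static String removerAcentos(String str) {
    // NFD splits each accented letter into a base letter plus combining marks;
    // the regex then removes every character outside the ASCII range.
    return Normalizer.normalize(str, Normalizer.Form.NFD).replaceAll("[^\\p{ASCII}]", "");
}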
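
As a rough illustration of what this approach does, here is a minimal self-contained sketch. The class name RemoverAcentosDemo and the sample inputs are only assumptions for the demo, and the non-ASCII characters are written as Unicode escapes so the file compiles regardless of source encoding. It also shows a side effect worth knowing: because the regex drops everything outside ASCII, a character that NFD does not decompose (such as the German sharp s) disappears entirely.

```java
import java.text.Normalizer;

public class RemoverAcentosDemo {

    // Decompose with NFD, then delete every non-ASCII character.
    public static String removerAcentos(String str) {
        return Normalizer.normalize(str, Normalizer.Form.NFD)
                         .replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        // "maca" with a cedilla on the c and a tilde on the last a
        System.out.println(removerAcentos("ma\u00E7\u00E3")); // prints: maca

        // "Strasse" spelled with a sharp s (U+00DF), which has no NFD decomposition,
        // so the whole character is removed rather than replaced
        System.out.println(removerAcentos("Stra\u00DFe"));    // prints: Strae
    }
}
```

If losing such characters is a problem for your input, a filter that targets only the combining marks may be a better fit.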
    
answered by 11.12.2013 / 17:39
5

If you are on Java 7+, you can use this solution, found on the English Stack Overflow (SOen): link

First, add these imports:

import java.text.Normalizer;
import java.util.regex.Pattern;

Then add this to your main class or to another class you use:

public static String deAccent(String str) {
    String nfdNormalizedString = Normalizer.normalize(str, Normalizer.Form.NFD); 
    Pattern pattern = Pattern.compile("\\p{InCombiningDiacriticalMarks}+");
    return pattern.matcher(nfdNormalizedString).replaceAll("");
}

Usage would look something like this:

System.out.print(deAccent("Olá, mundo!"));

It uses the following regular expression to strip the accent marks left after normalization: \p{InCombiningDiacriticalMarks}+
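
To make the behaviour of that expression concrete, here is a small sketch under a few assumptions: the class name DeAccentDemo and the constant name DIACRITICS are hypothetical, and the pattern is compiled once in a static field instead of on every call, which is a common variation and not something the linked answer requires. The expression matches only characters in the Combining Diacritical Marks block (U+0300 to U+036F), so non-ASCII characters that are not combining marks are left in place.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

public class DeAccentDemo {

    // Compiled once and reused; matches runs of characters in U+0300..U+036F.
    private static final Pattern DIACRITICS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    public static String deAccent(String str) {
        // NFD separates each accented letter into base letter + combining mark(s).
        String decomposed = Normalizer.normalize(str, Normalizer.Form.NFD);
        // Remove only the combining marks; base letters stay untouched.
        return DIACRITICS.matcher(decomposed).replaceAll("");
    }

    public static void main(String[] args) {
        // "Ola, mundo!" with an acute accent on the a
        System.out.println(deAccent("Ol\u00E1, mundo!")); // prints: Ola, mundo!

        // The sharp s (U+00DF) is not a combining mark, so the string is unchanged
        System.out.println(deAccent("Stra\u00DFe").equals("Stra\u00DFe")); // prints: true
    }
}
```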

See it working on IDEONE: link

    
answered by 06.12.2017 at 19:27