I have a string
áéíóú
that I want to convert to
aeiou
How do I remove the accents? I need to save it in the database as a URL.
You can use this function:
public static string RemoveAccents(this string text)
{
    StringBuilder sbReturn = new StringBuilder();
    var arrayText = text.Normalize(NormalizationForm.FormD).ToCharArray();
    foreach (char letter in arrayText)
    {
        if (CharUnicodeInfo.GetUnicodeCategory(letter) != UnicodeCategory.NonSpacingMark)
            sbReturn.Append(letter);
    }
    return sbReturn.ToString();
}
Source: link
You can also loop over the characters in the variable comAcentos and call Replace on the text passed to the function. Each letter found in comAcentos is replaced by the letter at the same position in semAcentos, and the new text is returned.
public static string removerAcentos(string texto)
{
    string comAcentos = "ÄÅÁÂÀÃäáâàãÉÊËÈéêëèÍÎÏÌíîïìÖÓÔÒÕöóôòõÜÚÛüúûùÇç";
    string semAcentos = "AAAAAAaaaaaEEEEeeeeIIIIiiiiOOOOOoooooUUUuuuuCc";
    for (int i = 0; i < comAcentos.Length; i++)
    {
        texto = texto.Replace(comAcentos[i].ToString(), semAcentos[i].ToString());
    }
    return texto;
}
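As a variation on the table approach above, here is a minimal sketch that applies the same two strings in a single pass instead of calling Replace once per table entry. The class name AccentTable and the method name RemoverAcentosTabela are hypothetical, chosen only for illustration. Characters that are not in the table (for example ñ) are left unchanged, exactly as in the loop above.

```csharp
using System.Text;

public static class AccentTable
{
    // Same lookup strings as in the answer above; position i in ComAcentos
    // maps to position i in SemAcentos.
    private const string ComAcentos = "ÄÅÁÂÀÃäáâàãÉÊËÈéêëèÍÎÏÌíîïìÖÓÔÒÕöóôòõÜÚÛüúûùÇç";
    private const string SemAcentos = "AAAAAAaaaaaEEEEeeeeIIIIiiiiOOOOOoooooUUUuuuuCc";

    // Hypothetical helper: walks the input once and looks each character up
    // in the table, building the result in a StringBuilder.
    public static string RemoverAcentosTabela(string texto)
    {
        var sb = new StringBuilder(texto.Length);
        foreach (char c in texto)
        {
            int i = ComAcentos.IndexOf(c); // ordinal search, -1 if not found
            sb.Append(i >= 0 ? SemAcentos[i] : c);
        }
        return sb.ToString();
    }
}
```

Calling AccentTable.RemoverAcentosTabela("áéíóú") returns "aeiou". The trade-off is the same as for the original loop: only the letters listed in the table are handled.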
Using LINQ is very convenient:
public static string RemoverAcentuacao(this string text)
{
    return new string(text
        .Normalize(NormalizationForm.FormD)
        .Where(ch => char.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark)
        .ToArray());
}
About NormalizationForm.FormD and UnicodeCategory.NonSpacingMark:
FormD is a way of representing the original string so that marks such as accents, the cedilla, and others are separated into distinct characters: the base character, which is the letter, and the mark character. The mark character belongs to the category NonSpacingMark, meaning a mark that occupies no space of its own and is applied to the preceding character.
Using LINQ, we can filter out these marks, keep only the base characters, and construct a new string from them.
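To make the decomposition described above visible, here is a small sketch that lists each character of the FormD string together with its Unicode category. The class NormalizationDemo and the method Describe are hypothetical names used only for this illustration.

```csharp
using System.Globalization;
using System.Linq;
using System.Text;

public static class NormalizationDemo
{
    // Hypothetical helper: decomposes the text with FormD and returns one
    // entry per UTF-16 char, formatted as "U+XXXX Category".
    public static string[] Describe(string text)
    {
        return text
            .Normalize(NormalizationForm.FormD)
            .Select(ch => $"U+{(int)ch:X4} {CharUnicodeInfo.GetUnicodeCategory(ch)}")
            .ToArray();
    }
}
```

For the single character "é" (U+00E9), Describe returns two entries: "U+0065 LowercaseLetter" for the base letter e and "U+0301 NonSpacingMark" for the combining acute accent. Filtering out the second kind of entry is what the LINQ version does.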
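public static string RemoverAcentos(this string texto)
{
    if (string.IsNullOrEmpty(texto))
        return String.Empty;
    // ISO-8859-8 (Hebrew) has no accented Latin letters, so on .NET Framework the
    // encoder's best-fit fallback maps them to their unaccented base letters.
    byte[] bytes = System.Text.Encoding.GetEncoding("iso-8859-8").GetBytes(texto);
    // For Latin text the resulting bytes are ASCII, so they decode cleanly as UTF-8.
    return System.Text.Encoding.UTF8.GetString(bytes);
}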
string nome = "João Felipe Portela";
string nomeSemAcentos = nome.RemoverAcentos();
This is the method I use to remove accents:
public static string RemoverAcentos(string texto)
{
    string s = texto.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();
    for (int k = 0; k < s.Length; k++)
    {
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(s[k]);
        if (uc != UnicodeCategory.NonSpacingMark)
        {
            sb.Append(s[k]);
        }
    }
    return sb.ToString();
}
An alternative to the answers above is to install the following NuGet package: link
Then you can remove the accents as follows:
var str = "áéíóú";
str = str.RemoveDiacritics();
The Greek code page can do this.
Information about this code page can be obtained from the System.Text.Encoding.GetEncodings() method. See more here.
Greek (ISO) has code page number 28597 and the name iso-8859-7.
Let's go to the code... \o/
string text = "Você está numa situação lamentável";
string textEncode = System.Web.HttpUtility.UrlEncode(text, Encoding.GetEncoding("iso-8859-7"));
//result: "Voce+esta+numa+situacao+lamentavel"
string textDecode = System.Web.HttpUtility.UrlDecode(textEncode);
//result: "Voce esta numa situacao lamentavel"
Then you can write this function:
public string RemoverAcentuacao(string text)
{
    return
        System.Web.HttpUtility.UrlDecode(
            System.Web.HttpUtility.UrlEncode(
                text, Encoding.GetEncoding("iso-8859-7")));
}
Note that Encoding.GetEncoding("iso-8859-7") is equivalent to Encoding.GetEncoding(28597). The first looks the encoding up by name, the second by code page number.
Other options can be found on Stack Overflow in English: