Regex - Removing Special Characters in C#

3

Regex.Replace is a great solution for removing accents.

However, there is one type of character I cannot remove. I have a string that receives the text "1st General Place", and the ordinal in that string is written with the character '°'. Is there a list of these kinds of characters? How do you go about eliminating them?

    
asked by anonymous 19.07.2014 / 19:07

2 answers

6

I do not know of a specific list for these types of characters.

One approach you can use is the reverse: replace every character that does not belong to a given set, using a negated character class [^ ]

Example

(?i) - Makes the regex case insensitive

[^0-9a-záéíóúàèìòùâêîôûãõç\s] - Matches every character that is not in the ranges A to Z ( a-z ), 0 to 9 ( 0-9 ), whitespace and similar ( \s ), or the accented letters ( áéíóúàèìòùâêîôûãõç )

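using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      // Sample text containing an ordinal indicator, a comma and an exclamation mark
      string input = "Você chegou em 1º lugar, Parabéns!";
      // (?i) ignores case; the negated class matches anything that is NOT
      // a digit, a letter (plain or accented from the list) or whitespace
      string pattern = @"(?i)[^0-9a-záéíóúàèìòùâêîôûãõç\s]";
      // Every match is replaced by the empty string, i.e. removed
      string replacement = "";
      Regex rgx = new Regex(pattern);
      string result = rgx.Replace(input, replacement);

      Console.WriteLine("Original string : {0}", input);
      Console.WriteLine("Processed string: {0}", result);
   }
}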

Edit

The nice thing about regular expressions is that the same problem can be solved in several ways. While running some tests here I remembered the shorthand character classes, and with them I was able to apply a negated list of alphanumerics, accented letters and whitespace, reaching the expected result in a cleaner way.
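
As a rough sketch of what a shorthand-based variant can look like (this is an assumption for illustration, not necessarily the exact pattern meant above): in .NET, \w already covers Unicode letters, including accented ones, plus digits and the underscore, so [^\w\s] removes symbols such as '°' without listing every accented letter. The ordinal indicators 'º' and 'ª' are classified as letters in Unicode, so this sketch removes them explicitly, together with the underscore. The names ShorthandCleaner and Clean are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ShorthandCleaner
{
    // [^\w\s]          : anything that is neither a word character nor whitespace
    // [_\u00BA\u00AA]  : underscore and the ordinal indicators (º, ª), which \w
    //                    would otherwise keep (assumption: they should be removed too)
    private static readonly Regex Unwanted = new Regex(@"[^\w\s]|[_\u00BA\u00AA]");

    // Hypothetical helper: returns the input with every unwanted character removed.
    public static string Clean(string input)
    {
        return Unwanted.Replace(input, "");
    }
}

public static class ShorthandDemo
{
    public static void Main()
    {
        // \u00B0 is the degree sign '°' mentioned in the question
        Console.WriteLine(ShorthandCleaner.Clean("1\u00B0 Lugar Geral"));
        // \u00BA is the masculine ordinal indicator 'º'
        Console.WriteLine(ShorthandCleaner.Clean("Voc\u00EA chegou em 1\u00BA lugar, Parab\u00E9ns!"));
    }
}
```

With these inputs the sketch should print "1 Lugar Geral" and "Você chegou em 1 lugar Parabéns". Note that punctuation such as the comma and the exclamation mark is removed as well; add those characters to the negated class if they should be kept.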

19.07.2014 / 20:37

10

I would use a simple regular expression that keeps only letters and numbers:

Regex.Replace(minhaString, "[^0-9a-zA-Z]+", "");
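
Keep in mind that this class is ASCII-only, so it also drops accented letters and spaces. A minimal sketch of that behavior follows; the class and method names are made up for the illustration, and the variant that keeps spaces is an assumption about what one might want, not part of the pattern above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AsciiOnlyDemo
{
    // Same pattern as above: keeps only ASCII letters and digits.
    public static string KeepAsciiAlphanumerics(string input)
    {
        return Regex.Replace(input, "[^0-9a-zA-Z]+", "");
    }

    // Hypothetical variant: a space inside the class keeps word boundaries.
    public static string KeepAsciiAlphanumericsAndSpaces(string input)
    {
        return Regex.Replace(input, "[^0-9a-zA-Z ]+", "");
    }

    public static void Main()
    {
        string minhaString = "Voc\u00EA chegou em 1\u00BA lugar, Parab\u00E9ns!";

        // Accented letters, spaces, 'º' and punctuation are all removed
        Console.WriteLine(KeepAsciiAlphanumerics(minhaString));          // Vocchegouem1lugarParabns

        // Spaces survive, accented letters are still removed
        Console.WriteLine(KeepAsciiAlphanumericsAndSpaces(minhaString)); // Voc chegou em 1 lugar Parabns
    }
}
```

If the accented letters must be preserved, they have to be listed in the class explicitly, as in the other answer.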
    
20.07.2014 / 02:52