Removing numbers at the end of a string with Regex in C#

7

I have a string that contains the names of some records.

Examples:

string nome = "MARIA APARECIDA DE SOUZA MOURA 636598241";
string nome = "MARIA APARECIDA DE SOUZA MOURA 2018";

I would like to remove the numbers only when they are at the end of the text and consist of more than 4 digits.

Examples:

  

MARIA APARECIDA DE SOUZA MOURA 636598241 would become: MARIA APARECIDA DE SOUZA MOURA

and

MARIA APARECIDA DE SOUZA MOURA 2017 would not change, since it contains only 4 digits.

I made a few attempts with Regex, but so far without success.

    
asked by anonymous 17.01.2018 / 14:05

4 answers

6
  • To extract the numbers at the end of the string, use the Regex @"\d+$"
  • Check that the length of the matched string is greater than 4
  • Use Replace to replace the match with nothing.

See it working on dotnetfiddle.

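string nome = "MARIA APARECIDA DE SOUZA MOURA 6365945";

// Capture the run of digits at the very end of the string (empty if there is none).
var numeros = Regex.Match(nome, @"\d+$").Value;

// Replace only the trailing run (and the space before it), so the same digits elsewhere in the name are left alone.
nome = numeros.Length > 4 ? Regex.Replace(nome, @"\s*\d+$", "") : nome;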

Reading the regex

\d is a shorthand for the [0-9] set, that is, it matches digits. (In .NET it also matches other Unicode decimal digits unless RegexOptions.ECMAScript is used.)

+ is a quantifier that matches one or more occurrences of the preceding element; it is the same as {1,}

$ is an anchor that matches at the end of the text
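To make the reading above concrete, here is a minimal sketch of what \d+$ captures on a few inputs. The class and method names (TrailingDigitsDemo, TrailingDigits) are made up for this illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingDigitsDemo
{
    // Hypothetical helper: returns the run of digits at the very end of the text,
    // or an empty string when the text does not end with a digit.
    public static string TrailingDigits(string text)
    {
        return Regex.Match(text, @"\d+$").Value;
    }

    public static void Main()
    {
        // Digits at the end are captured whatever their count.
        Console.WriteLine(TrailingDigits("MARIA 636598241"));
        Console.WriteLine(TrailingDigits("MARIA 2018"));
        // Digits in the middle are not at the end, so nothing is captured.
        Console.WriteLine(TrailingDigits("MARIA 12345 MOURA").Length);
    }
}
```

The pattern itself does not enforce the "more than 4 digits" rule; that is why the length of the match is checked separately.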

Another way that follows the same line of thinking: dotnetfiddle

    
17.01.2018 / 14:17
4

You can use the expression \d+$, which matches any run of digits at the end of the string.

After constructing a Regex instance with this expression, you can check the length of the substring that was found and do the replacement only if it has more than 4 characters.

See an example:

using static System.Console;
using System.Text.RegularExpressions;

public class Program
{   
    public static string Remover4DigitosFinais(string input)
    {   
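        // Match the run of digits at the very end of the input, if any.
        var expressao = new Regex(@"\d+$");
        var r = expressao.Match(input);
        // Strip it only when it is longer than 4 digits; r.Length is 0 when nothing matched.
        return r.Length > 4 ? expressao.Replace(input, "").TrimEnd() : input;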
    }

    public static void Main()
    {
        var validacoes = new [] 
        {
            new { Input = "MARIA 2 APARECIDA DE SOUZA MOURA 636598241", Esperado = "MARIA 2 APARECIDA DE SOUZA MOURA" },
            new { Input = "MARIA 2 APARECIDA DE SOUZA MOURA 2018", Esperado = "MARIA 2 APARECIDA DE SOUZA MOURA 2018" },
            new { Input = "JOAO 175", Esperado = "JOAO 175" },
            new { Input = "JOAO 1751233", Esperado = "JOAO" },
        };

        foreach(var val in validacoes)
        {
            var novo = Remover4DigitosFinais(val.Input);            
            var sucesso = (novo == val.Esperado);

            WriteLine($"Sucesso: {sucesso} - Entrada: {val.Input} - Saída: {novo} - Esperado: {val.Esperado}");
        }           
    }   
}

See it working on .NET Fiddle.

This approach works well if you want to take some of the responsibility away from the regex and so keep the expression shorter and easier to understand.

Otherwise, you can simply use the expression (\s\d{5,})+$. It matches any substring that starts with a whitespace character ( \s ) followed by five or more digits ( \d{5,} ), as long as it sits at the end of the main string ( $ ).

public static string Remover4DigitosFinais(string input)
{   
    var expressao = new Regex(@"(\s\d{5,})+$");             
    return expressao.Replace(input, "");
}

See working in .NET Fiddle.     
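As a rough illustration of how the single expression behaves at the edges, the sketch below runs it over a few extra inputs. The class name SingleExpressionDemo, the method Clean and the sample inputs are assumptions added for this example; only the pattern comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SingleExpressionDemo
{
    // Built once and reused; the pattern is the one from the answer.
    private static readonly Regex TrailingNumbers = new Regex(@"(\s\d{5,})+$");

    // Hypothetical helper: removes trailing groups of "whitespace + 5 or more digits".
    public static string Clean(string input)
    {
        return TrailingNumbers.Replace(input, "");
    }

    public static void Main()
    {
        string[] inputs = { "JOAO 1751233", "JOAO 175", "JOAO1751233", "MARIA 12345 67890" };
        foreach (var input in inputs)
        {
            Console.WriteLine($"\"{input}\" -> \"{Clean(input)}\"");
        }
    }
}
```

"JOAO 1751233" becomes "JOAO" and "JOAO 175" is left alone. "JOAO1751233" is also left alone, because the pattern requires a whitespace character before the digits, and "MARIA 12345 67890" becomes "MARIA", because the + after the group lets it consume several trailing numbers in a row.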

17.01.2018 / 14:29
1

Anchoring at the end is done with $, and numbers with more than 4 digits are matched with [0-9]{5,}. The final expression, which also checks for the space before the number, is (\s[0-9]{5,})+$ :

string nome0 = "0 - MARIA APARECIDA DE SOUZA MOURA 636598241";
string nome1 = "1 - MARIA APARECIDA DE SOUZA MOURA 2018";
string nome2 = "2 - MARIA APARECIDA DE SOUZA 636598241 MOURA";

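// Verbatim string (@"...") so that \s reaches the regex engine instead of being an invalid C# escape.
string strRegex = @"(\s[0-9]{5,})+$";

string resu0 = Regex.Replace(nome0, strRegex, ""); // trailing number removed
string resu1 = Regex.Replace(nome1, strRegex, ""); // unchanged: only 4 digits
string resu2 = Regex.Replace(nome2, strRegex, ""); // unchanged: number is not at the end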
17.01.2018 / 14:33
1

One tip is to use a character as a separator and use the Split() method to turn your string into an array, so you can easily get the name, or just the number, at any time.

Example:

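string nomeNumero = "MARIA APARECIDA DE SOUZA MOURA | 636598241";
// Split once on the separator (the char overload works on every .NET version) and trim the surrounding spaces.
string[] partes = nomeNumero.Split('|');
string nome = partes[0].Trim();
string numero = partes[1].Trim();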
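The example above assumes the separator is always present; indexing [1] throws otherwise. A slightly more defensive sketch could look like the following, where SplitDemo and Separar are hypothetical names introduced only for this illustration.

```csharp
using System;

public static class SplitDemo
{
    // Hypothetical helper: splits "name | number" on the first '|'.
    // When there is no separator, the whole text is the name and the number is empty.
    public static (string Nome, string Numero) Separar(string texto)
    {
        // Limit to 2 parts so any later '|' stays inside the second part.
        string[] partes = texto.Split(new[] { '|' }, 2);
        string nome = partes[0].Trim();
        string numero = partes.Length > 1 ? partes[1].Trim() : "";
        return (nome, numero);
    }

    public static void Main()
    {
        var comNumero = Separar("MARIA APARECIDA DE SOUZA MOURA | 636598241");
        Console.WriteLine($"{comNumero.Nome} / {comNumero.Numero}");

        var semNumero = Separar("MARIA APARECIDA DE SOUZA MOURA");
        Console.WriteLine($"{semNumero.Nome} / {semNumero.Numero}");
    }
}
```

This only helps if you control how the records are stored; it does not clean up strings that already have the number appended without a separator.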
    
17.01.2018 / 18:47