Ignore punctuation in a sequence of numbers using Regex with Java

3

I have the following input:

Fatura Cliente:  1.7852964.34 
CPF/CNPJ:  09022317000222

I need to get only the "Customer Invoice" numbers, ignoring scores, returning only 1785296434 , for this I am using the following regex:

Fatura Cliente[\D]+(\S+)

But later I need to treat and replace the scores to turn into a sequence of numbers.

How do I get regex to return a sequence of numbers by ignoring scores for the capture group without having to replace later in the code?

Capturing by the first regex already formatted without the scores is it possible or need to give a String.replace or String.replaceAll (regex) after the first capture with regex?

    
asked by anonymous 19.04.2018 / 15:19

2 answers

4

Regex

Regex: \b(\d+)(\.|,|\b)

Result

With these test strings:

Fatura Cliente: 6.823935.10
Fatura Cliente: 6,823935,10
Fatura, Cliente: 6,823935.10
Fatura. Cl1ente: 6.823935,10

And replace with: $1 , where $ 1 means the first capture group.

The following results are obtained:

Fatura Cliente: 682393510
Fatura Cliente: 682393510
Fatura, Cliente: 682393510
Fatura. Cl1ente: 682393510

Try demo on RegexPlanet or FreeFormatter

Explanation \b(\d+)(\.|,|\b)

  • \b - The position of a word boundary, ie the letter can not be followed by another letter.
  • First Capture Group (\d+)
    • \d - Corresponds to the digit equal to [0-9]
    • + - Quantifier that matches from one to unlimited times, as many times as possible (greedy)
  • Second Capture Group (\.|,| |$)
    • | - or
    • \. - Matches literally to the end point
    • , - Matches literally to comma
    • \b - The position of a word boundary, ie the letter can not be followed by another letter.

EDIT:

You would not need Regex, since it's a captured SubString, only replacing semicolons with replace would solve your problem.

Can not do with Regex only. You would need one more step to handle this, either with replace or with another method. There are some modes in the response of @ Douglas .

    
19.04.2018 / 16:15
2

Java implementation

public static void main(String[] argvs) {
    // Com ponto
    String numeroSemPonto = extraiNumeracao("Fatura Cliente: 6.823935.10");
    System.out.println(numeroSemPonto);

    // Com vírgula
    String numeroSemVirgula = extraiNumeracao("Fatura Cliente: 6,823935,10");
    System.out.println(numeroSemVirgula);

    //** Outra opção **//

    // Com ponto
    String numeroSemPonto2 = extraiNumeracao2("Fatura Cliente: 6.823935.10");
    System.out.println(numeroSemPonto2);

    // Com vírgula
    String numeroSemVirgula2 = extraiNumeracao2("Fatura Cliente: 6,823935,10");
    System.out.println(numeroSemVirgula2);
}

public static String extraiNumeracao(String str) {
    Pattern p = Pattern.compile("\d+");
    Matcher m = p.matcher(str);
    String resultado = "";

    while (m.find()) {
        resultado += m.group();
    }

    return resultado;
}

// Outra opção
public static String extraiNumeracao2(String str) {
    return str.split(": ")[1].replace(",", "").replace(".", "");
}
    
19.04.2018 / 15:34