AND operator in regex

I have the following date / time format:

25/01/2017 às 11:53:37

And the following regex:

  

    REGX_DATAHORA_DISTRIBUICAO = "(?<data>\d{1,2}\/\d{1,2}\/\d{4})|(?<hora>\d{1,2}:\d{1,2}:\d{1,2})"

    private OffsetDateTime getDataDistribuicao() {
        String textoData = replaceAndTrim(this.getPaginaInfoGerais()
                .<HtmlTableCell>getFirstByXPath(XPATH_CEL_DATA_DISTRIBUICAO)
                .getTextContent());
        return LocalDateTime
                .parse(getDataDistribuicao(textoData),
                        DateTimeFormatter.ofPattern(PATTERN_DATA_HORA))
                .atOffset(ZoneOffset.UTC);
    }

    private String getDataDistribuicao(final String dataTexto) {
        final Matcher matcherDataHora = REGX_DATAHORA_DISTRIBUICAO.matcher(dataTexto);
        if (matcherDataHora.find()) {
            return matcherDataHora.group();
        } else {
            throw new RegexException("Data distribuição", REGX_DATAHORA_MOVIMENTACAO.pattern(), dataTexto);
        }
    }

The regex has 2 named groups, but only one of them is returned: the date. The time group comes back as null. I suppose it's because of the | operator.
I've already tried a positive lookahead, (?=...), but maybe I used it wrong.
What should I do?

    
asked by anonymous 09.10.2017 / 15:30

2 answers


"It has 2 groups, but both groups return only the date part: 25/01/2017."

Actually, your groups return different things:

  • (?<data>\d{1,2}/\d{1,2}/\d{4}) - Returns: 25/01/2017

  • (?<hora>\d{1,2}:\d{1,2}:\d{1,2}) - Returns: 11:53:37

I ran a test to confirm this; you can see it here.

      

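"I suppose it's because of the operator"

I believe you are not getting the whole result because of how the match is read. Matcher.group() with no arguments returns the full text of the current match, and with | each match is either the date or the time, never both. So after a single find() only the date comes back and the hora group is still null. The Matcher documentation describes group() and group(String) in detail.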
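
To see what the | version does, here is a small self-contained sketch. The class name AlternationDemo and the helper describeMatches are made up for illustration, and the input assumes the text looks like 25/01/2017 às 11:53:37. It only lists, for each successful find(), what each named group holds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AlternationDemo {
    // Same alternation as in the question: date OR time.
    static final Pattern DATA_OU_HORA = Pattern.compile(
            "(?<data>\\d{1,2}/\\d{1,2}/\\d{4})|(?<hora>\\d{1,2}:\\d{1,2}:\\d{1,2})");

    // Hypothetical helper: one entry per find(), showing both named groups.
    static List<String> describeMatches(String texto) {
        List<String> resultado = new ArrayList<>();
        Matcher m = DATA_OU_HORA.matcher(texto);
        while (m.find()) {
            // Only the alternative that matched has a value; the other group is null.
            resultado.add("data=" + m.group("data") + ", hora=" + m.group("hora"));
        }
        return resultado;
    }

    public static void main(String[] args) {
        describeMatches("25/01/2017 às 11:53:37").forEach(System.out::println);
    }
}
```

With that input there are two matches: the first has data set and hora null, the second has hora set and data null. That is why a single find() followed by group() only ever gives you the date.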

      

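"What to do?"

You can read each named group explicitly:

    private String getDataDistribuicao(final String dataTexto) {
        final Matcher matcherDataHora = REGX_DATAHORA_DISTRIBUICAO.matcher(dataTexto);
        String data = null;
        String hora = null;
        // With "|", each find() matches either the date or the time, so loop over all matches.
        while (matcherDataHora.find()) {
            if (matcherDataHora.group("data") != null) {
                data = matcherDataHora.group("data");
            } else {
                hora = matcherDataHora.group("hora");
            }
        }
        if (data == null || hora == null) {
            throw new RegexException("Data distribuição", REGX_DATAHORA_DISTRIBUICAO.pattern(), dataTexto);
        }
        // Assumes PATTERN_DATA_HORA expects a space between date and time; adjust if not.
        return data + " " + hora;
    }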
    

If it still does not work, I suggest debugging the values returned by matcherDataHora.group("data") and matcherDataHora.group("hora"). If either one is empty, check that the input you posted here is really what the code receives, since the regex should match this format.

        
    09.10.2017 / 16:11

Well, from what I researched, the problem is that the | (OR) operator in Java matches only one alternative at a time, so each match fills only one of the groups. I worked around it the way I saw suggested in some forums:

    private String getDataDistribuicao(final String dataTexto) {
        String[] grupos = dataTexto.split("às");
        StringBuilder dataHora = new StringBuilder();
        for (String grupo : grupos) {
            final Matcher matcherDataHora = REGX_DATAHORA_DISTRIBUICAO.matcher(grupo);
            if (matcherDataHora.find()) {
                dataHora.append(" ").append(grupo);
            } else {
                throw new RegexException("Data distribuição", REGX_DATAHORA_DISTRIBUICAO.pattern(), dataTexto);
            }
        }
        return dataHora.toString();
    }
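
Another option, if the goal really is "date AND time", is to drop the | and match both parts in sequence inside one pattern, so a single find() fills both groups. A minimal sketch, assuming the text looks like 25/01/2017 às 11:53:37. The class name DataHoraExtractor, the \D+ separator, the d/M/yyyy H:m:s format and the IllegalArgumentException (standing in for RegexException, which is not shown here) are my assumptions, not taken from the original code.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DataHoraExtractor {
    // Date, then any run of non-digits (e.g. " às "), then time: both groups in one match.
    static final Pattern DATA_E_HORA = Pattern.compile(
            "(?<data>\\d{1,2}/\\d{1,2}/\\d{4})\\D+(?<hora>\\d{1,2}:\\d{1,2}:\\d{1,2})");

    // Assumed format; single pattern letters accept one- or two-digit fields.
    static final DateTimeFormatter FORMATO = DateTimeFormatter.ofPattern("d/M/yyyy H:m:s");

    static OffsetDateTime parse(String texto) {
        Matcher m = DATA_E_HORA.matcher(texto);
        if (!m.find()) {
            // The original code throws its own RegexException here.
            throw new IllegalArgumentException("No date/time found in: " + texto);
        }
        String dataHora = m.group("data") + " " + m.group("hora");
        return LocalDateTime.parse(dataHora, FORMATO).atOffset(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        System.out.println(parse("25/01/2017 às 11:53:37"));
    }
}
```

This avoids both the split and the loop, at the cost of requiring the date to come before the time in the text.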
    
        
    10.10.2017 / 13:29