Question about a regular expression in Java

6

I have the following regular expressions. The first one validates words and works correctly. The problem is the second one, which should validate a file path such as "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt". I can't get it to validate a path like that. Can anyone help me?

public class validador {

    public boolean validarPalavra(String palavra) {
       Pattern p = Pattern.compile("[A-Z0-9a-z]*");
       Matcher retorno = p.matcher(palavra);
       return retorno.matches();
    }

    public boolean validarCaminho(String caminho) {
       Pattern p = Pattern.compile("//([a-zA-z0-9])+");
       Matcher retorno = p.matcher(caminho);
       return retorno.matches();
    }
}
    
asked by anonymous 08.11.2015 / 20:34

2 answers

7

First, Pattern is an expensive object to create. However, it is immutable and reusable, so it is best to create each one only once.

Second, your second expression uses [a-zA-z0-9] . The a-z covers lowercase letters and 0-9 covers digits, but A-z is wrong: the z should be uppercase. As written, the range also admits the punctuation characters that sit between Z and a in ASCII, such as [ and _ . Even with that fixed, the regular expression needs to be considerably more elaborate. One correct regular expression (among several possible ones) would be:

^(?:(?:[A-Z]\:)?\/)?(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*)(?:\/(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*))*(?:\.[a-zA-Z0-9]+)?|[A-Z]\:\/?$

Explanation of regular expression:

  • ^ - String start.

  • (?:(?:[A-Z]\:)?\/)? - Here are some things:

    • (?: ... ) - Used to group without capturing. We have two groups of this.
    • [A-Z] - A capital letter.
    • \: - The character : after the uppercase letter.
    • (?:[A-Z]\:)? - Capital letter followed by : may or may not appear (because of ? ).
    • \/ - The character / , which may come after the uppercase letter and : , or right at the beginning of the string.
    • The last ? of (?:(?:[A-Z]\:)?\/)? means that the / , or the uppercase letter followed by :/ , can be omitted.

    That is, this part recognizes the prefix of the path. In paths such as C:/texto , /texto and plain texto , it is responsible for matching whatever precedes texto .

  • (?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*) - The name of a directory. Here we also have several things:

    • (?: ... ) - Two groupings without capture.
    • [a-zA-Z0-9]+ - One word. It must have at least one character (because of + ).
    • (?: [a-zA-Z0-9]+)* - A space followed by a word. The * indicates that it can occur zero or more times. Whenever a space occurs, a word must follow immediately.

    In this way, a directory name consists of a set of one or more words separated by space. Consecutive multiple spaces are not allowed. Spaces at the end or beginning of the name are not allowed.

  • (?:\/(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*))* - Here are four things:

    • (?: ... ) - Group without doing capture.
    • \/ - The / character.
    • (?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*) - Same as previously shown. This is a directory name occurring shortly after / .
    • * - Repeat the whole group as many times as necessary (including possibly zero times).

    That is, this part recognizes every /word segment after the first name. It also matches when there are no such segments at all.

  • (?:\.[a-zA-Z0-9]+)? - Again four things:

    • (?: ... ) - Another group without capture.
    • \. - The . character.
    • [a-zA-Z0-9]+ - A word after the . character. It must have at least one character. Note that no spaces are allowed here (this part is the file extension).
    • ? - The group may or may not appear.

    This part therefore recognizes the .extension at the end, which is optional.

  • |[A-Z]\:\/? - Everything before this recognizes a full path. However, since the first name is mandatory in that part, paths such as C: and C:/ would not be recognized. So we add | (meaning an alternative to try if what comes before fails) followed by an uppercase letter ( [A-Z] ), : and an optional / ( \/? ).

  • $ - End of string.
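
The breakdown above can be sketched by assembling the pattern from named pieces. This is only an illustration: the class and constant names (PathPieces, PREFIX, NAME and so on) are hypothetical and not part of the answer. The sketch also leaves : and / unescaped, since neither is special in a Java regex, and omits ^ and $ on the assumption that Matcher.matches() is used, which already requires the whole input to match.

```java
import java.util.regex.Pattern;

// Hypothetical helper class: the names are illustrative, not from the original answer.
class PathPieces {
    // Optional prefix: "/" alone, or an uppercase drive letter followed by ":/".
    static final String PREFIX = "(?:(?:[A-Z]:)?/)?";
    // One name: words of letters/digits separated by single spaces.
    static final String NAME = "(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*)";
    // Zero or more further names, each introduced by "/".
    static final String MORE_NAMES = "(?:/" + NAME + ")*";
    // Optional extension: "." followed by letters/digits.
    static final String EXTENSION = "(?:\\.[a-zA-Z0-9]+)?";
    // Alternative for a bare drive such as "C:" or "C:/".
    static final String DRIVE_ONLY = "[A-Z]:/?";

    static final Pattern PATH =
            Pattern.compile(PREFIX + NAME + MORE_NAMES + EXTENSION + "|" + DRIVE_ONLY);

    static boolean isPath(String s) {
        // matches() succeeds only if the entire string matches the pattern.
        return PATH.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPath("C:/home/Paulo Neto/texto.txt")); // true
        System.out.println(isPath("C:/"));                          // true
        System.out.println(isPath("home//texto.txt"));              // false
    }
}
```

Keeping each piece in its own constant makes it easier to adjust one rule (for example, which characters a name may contain) without re-reading the whole expression.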

It is also important to note that in Java the \ character is used inside string literals for escape sequences (such as \n for a line break). Since we want a literal backslash rather than an escape sequence, we have to write \\ to represent \ . So, to produce the \: , \/ and \. of the regular expression, the source code has to contain \\: , \\/ and \\. , as you can see in the code below:

import java.util.regex.Pattern;

public class Validador {

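    // Compiled once and reused: Pattern instances are immutable.
    private static final Pattern p1 = Pattern.compile("^[A-Z0-9a-z]*$");

    // Each regex backslash is doubled because this is a Java string literal.
    private static final Pattern p2 =
            Pattern.compile("^(?:(?:[A-Z]\\:)?\\/)?(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*)(?:\\/(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*))*(?:\\.[a-zA-Z0-9]+)?|[A-Z]\\:\\/?$");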

    public static boolean validarPalavra(String palavra) {
        return p1.matcher(palavra).matches();
    }

    public static boolean validarCaminho(String caminho) {
        return p2.matcher(caminho).matches();
    }
}

Finally, note that your first validator uses [A-Z0-9a-z]* rather than [A-Z0-9a-z]+ (that is, * instead of + ). This means it also accepts an empty string. If that is not intentional, just change * to + . I also added ^ and $ to it to mark the beginning and end of the string.
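
The difference between * and + on an empty string can be checked with a small sketch like the one below. The class name WordCheck and its two constants are hypothetical, used only for this illustration. It also shows that Matcher.matches() tests the whole string while find() accepts a matching substring, which is why matches() behaves as if the pattern were anchored.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class WordCheck {
    static final Pattern STAR = Pattern.compile("[A-Z0-9a-z]*");
    static final Pattern PLUS = Pattern.compile("[A-Z0-9a-z]+");

    public static void main(String[] args) {
        System.out.println(STAR.matcher("").matches());        // true: * allows zero characters
        System.out.println(PLUS.matcher("").matches());        // false: + needs at least one
        System.out.println(PLUS.matcher("abc123").matches());  // true
        System.out.println(PLUS.matcher("abc 123").matches()); // false: the space is not allowed
        System.out.println(PLUS.matcher("abc 123").find());    // true: "abc" alone is enough for find()
    }
}
```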

Well, here are some tests:

public class Main {
    private static void testar(boolean resultado, String teste) {
        System.out.println(Validador.validarCaminho(teste) == resultado ? "Ok" : "ERRO");
    }

    public static void main(String[] args) {
        testar(true, "C:/home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(true, "C:/home/Paulo Neto/Net Beans Projects/Expre/src/expre/texto.txt");
        testar(true, "/home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(true, "/home/Paulo Neto/Net Beans Projects/Expre/src/expre/texto.txt");
        testar(true, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(true, "home/Paulo Neto/Net Beans Projects/Expre/src/expre/texto.txt");
        testar(true, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto");
        testar(true, "home/Paulo Neto/Net Beans Projects/Expre/src/expre/texto");
        testar(true, "home");
        testar(true, "/home");
        testar(true, "C:/home");
        testar(true, "home.txt");
        testar(true, "/home.txt");
        testar(true, "C:/home.txt");
        testar(true, "C:");
        testar(true, "C:/");
        testar(false, "a:");
        testar(false, "a:/");
        testar(false, " home");
        testar(false, "home ");
        testar(false, "home/");
        testar(false, "home.");
        testar(false, ".txt");
        testar(false, "C:home");
        testar(false, "C:home/texto");
        testar(false, "home//texto.txt");
        testar(false, "ho  me/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.");
        testar(false, "home/PauloNeto/NetBeans#Projects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto/NetBeans  Projects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto..txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.x.txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt.");
        testar(false, " E:/home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "E :/home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "E: /home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "E:/ home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "E:/home /PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home /PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, " home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home/ PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto /NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto. txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt ");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.t xt");
        testar(false, "E: ");
        testar(false, "E :");
        testar(false, " E:");
        testar(false, "E:/ ");
        testar(false, "E: /");
        testar(false, "E :/");
        testar(false, " E:/");
        testar(false, "");
        testar(false, " ");
        testar(false, "/");
        testar(false, ".");
        testar(false, ":");
    }
}

In all tests the output was "Ok".

See it running on ideone.
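
Returning to the A-z range mentioned at the start of this answer, a minimal sketch like the following shows why it matters. The class name RangeCheck and its constants are hypothetical. In ASCII the characters [ , \ , ] , ^ , _ and ` sit between Z and a, so the range A-z accepts them as well.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class RangeCheck {
    // A-z spans every code point from 'A' (65) to 'z' (122), punctuation included.
    static final Pattern WRONG = Pattern.compile("[a-zA-z0-9]+");
    static final Pattern RIGHT = Pattern.compile("[a-zA-Z0-9]+");

    public static void main(String[] args) {
        System.out.println(WRONG.matcher("foo_bar").matches()); // true: '_' falls inside A-z
        System.out.println(RIGHT.matcher("foo_bar").matches()); // false
        System.out.println(WRONG.matcher("a[b]").matches());    // true: '[' and ']' fall inside A-z
    }
}
```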

    
08.11.2015 / 20:47
2

Do this:

public boolean validarCaminho(String caminho) {
       Pattern p = Pattern.compile("[a-zA-Z0-9\\.\\/]+");
       Matcher retorno = p.matcher(caminho);
       return retorno.matches();
}
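
To get a feel for what this character class accepts, a quick check along these lines may help. The class name SimpleCheck is hypothetical and the inputs are just examples. Because the class allows letters, digits, . and / in any order, it accepts the path from the question but also strings with repeated separators, and it rejects names that contain spaces.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class SimpleCheck {
    static final Pattern SIMPLE = Pattern.compile("[a-zA-Z0-9\\.\\/]+");

    static boolean accepts(String s) {
        return SIMPLE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt")); // true
        System.out.println(accepts("home//texto..txt"));          // true: order of . and / is not checked
        System.out.println(accepts("home/Paulo Neto/texto.txt")); // false: space is not in the class
    }
}
```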
    
08.11.2015 / 20:46