Meaning of ?: ?= ?! ?<= ?<! in a regex

15

In several regexes I have noticed symbols that do not seem to be part of what is matched, but rather provide some kind of functionality. I would like to know the name or term for each of these symbols and what each one does.

?:
?=
?!
?<=
?<!
    
asked by anonymous 22.04.2014 / 16:06

2 answers

14

If you are referring to .NET regexes, using the Regex class, these symbols can be used when opening a group with parentheses:

( + symbols + ... + )

What they mean:

  • ?: Non-capturing group: marks a group that will not appear in the list of captured groups. Its text is still part of the overall match; it is just not stored as a separate group. For example:

    String parsed: abc. 123 xpto<fim>
       Regex: \w+(?:\.|<fim>)
       Matches: abc. , xpto<fim>
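
To make the difference from an ordinary group visible, here is a minimal sketch in Python rather than .NET (an assumption of this example; the group syntax is the same in both engines). The variable names are only illustrative. It runs the pattern above next to a capturing variant and collects what each one records:

```python
import re

text = "abc. 123 xpto<fim>"

# Non-capturing group: the alternation is grouped, but nothing is stored for it.
non_capturing = re.compile(r"\w+(?:\.|<fim>)")
# The same pattern with an ordinary capturing group, for comparison.
capturing = re.compile(r"\w+(\.|<fim>)")

# Full matches are identical for both patterns.
full_matches = [m.group(0) for m in non_capturing.finditer(text)]
full_matches_capturing = [m.group(0) for m in capturing.finditer(text)]

# Only the capturing variant records the suffix as group 1.
captured_suffixes = [m.group(1) for m in capturing.finditer(text)]

print(full_matches)            # ['abc.', 'xpto<fim>']
print(non_capturing.groups)    # 0
print(captured_suffixes)       # ['.', '<fim>']
```

Both patterns consume the same text; the only difference is whether the suffix is available afterwards as a numbered group.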

The others are assertions: they capture nothing and do not advance the read position:

  • ?= Positive Lookahead: an assertion that checks that the group can be matched starting at the current position, without capturing anything or advancing through the string being parsed. For example:

    Read words that come right before a period ( . )
       String parsed: 123. xpto.
       Regex: \b\w+\b(?=\.)
       Matches: 123 , xpto

  • ?! Negative Lookahead: an assertion that checks that the group cannot be matched starting at the current position, without capturing anything or advancing through the string being parsed. For example:

    Read words that do not come right before a period ( . )
       String parsed: 123 xpto abc.
       Regex: \b\w+\b(?!\.)
       Matches: 123 , xpto

  • ?<= Positive Lookbehind: an assertion that checks that the group can be matched ending at the current position, without capturing anything or advancing through the string being parsed. For example:

    Read words that come right after a period ( . )
       String parsed: abc. 123 xpto.
       Regex: (?<=\.\s*)\b\w+\b
       Matches: 123

  • ?<! Negative Lookbehind: an assertion that checks that the group cannot be matched ending at the current position, without capturing anything or advancing through the string being parsed. For example:

    Read words that do not come right after a period ( . )
       String parsed: abc. 123 xpto.
       Regex: (?<!\.\s*)\b\w+\b
       Matches: abc , xpto
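
The four assertions can be tried out with a short sketch in Python (an assumption of this example; the answer itself targets .NET). One difference matters here: Python's built-in re module only accepts fixed-width lookbehind, so the .NET patterns with \.\s* are narrowed to a period followed by exactly one space. That narrowing is an adaptation for this sketch, not part of the original patterns.

```python
import re

# Positive lookahead: words immediately followed by a period.
before_period = re.findall(r"\b\w+\b(?=\.)", "123. xpto.")

# Negative lookahead: words not immediately followed by a period.
not_before_period = re.findall(r"\b\w+\b(?!\.)", "123 xpto abc.")

# Lookbehind, narrowed to the fixed-width prefix ". " because Python's
# re module rejects variable-width lookbehind such as (?<=\.\s*).
after_period = re.findall(r"(?<=\. )\b\w+\b", "abc. 123 xpto.")
not_after_period = re.findall(r"(?<!\. )\b\w+\b", "abc. 123 xpto.")

print(before_period)      # ['123', 'xpto']
print(not_before_period)  # ['123', 'xpto']
print(after_period)       # ['123']
print(not_after_period)   # ['abc', 'xpto']
```

In every case the period itself is absent from the results, because an assertion only tests the surrounding text and never consumes it.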

If you are interested, I usually use this tool to work with regexes in C#:

link

    
22.04.2014 / 16:15
6

Lookahead is a way to match strings that are, or are not, followed by a particular pattern. Use (?=..) for the positive form, meaning "followed by", and (?!..) for the negative form, meaning "not followed by".

Lookbehind does the same thing as lookahead but, as the name says, it looks backward in the string: it checks what comes before the current position. Use (?<=..) for the positive form and (?<!..) for the negative form.

For example, consider the sequence foobarbarfoo.

bar(?=bar)    matches the first bar.
bar(?!bar)    matches the second bar.
(?<=foo)bar   matches the first bar.
(?<!foo)bar   matches the second bar.

You can also combine them:

(?<=foo)bar(?=bar)   matches the first bar.
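
One way to check which bar each pattern selects is to look at the start offset of the match. The sketch below uses Python's re module (an assumption; any engine with lookaround support should behave the same way for these fixed-width patterns), and the helper name match_starts is hypothetical:

```python
import re

text = "foobarbarfoo"  # first "bar" starts at index 3, second at index 6


def match_starts(pattern: str) -> list[int]:
    """Return the start offset of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]


starts = {
    "bar(?=bar)": match_starts(r"bar(?=bar)"),
    "bar(?!bar)": match_starts(r"bar(?!bar)"),
    "(?<=foo)bar": match_starts(r"(?<=foo)bar"),
    "(?<!foo)bar": match_starts(r"(?<!foo)bar"),
    "(?<=foo)bar(?=bar)": match_starts(r"(?<=foo)bar(?=bar)"),
}

for pattern, offsets in starts.items():
    print(f"{pattern:20} -> {offsets}")
```

An offset of 3 corresponds to the first bar and 6 to the second; each matched span is still just the three characters bar, since the assertions add nothing to the match.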

The online tool RegExr can help you build and test expressions; it also includes examples.

This is explained in more detail here.

I will soon update the answer with more information.     

22.04.2014 / 16:32