The end of a capture in a regular expression

3

I could not think of a better title than this, as I am not familiar with the technical terms used for regular expressions.

So let me describe my problem. I have some code that generates a regular expression to capture a certain expression and then transform it into valid PHP code.

For example, the following string:

%[ $variable = 1 ]

generates the following code:

<?php $variable = 1; ?>

I got that working. Expressions like the ones below also work correctly:

 %[ foreach ($array as $key => $value) ]
 %[ endforeach ]

I'm having problems when the captured expression is on the same line.

An example:

%[ echo "Esse é o Wallace"] %[ "esse é meu nome" ]

The generated output is this:

 <?php echo "Esse é o Wallace"] %[ "esse é meu nome"  ?>

The regular expression I use for this is generated by a class that builds it roughly as follows (shown with sprintf for readability):

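$exp1 = '%[';
$exp2 = ']';

// Escape the markers so they are matched literally; '/' is the pattern delimiter
$regexp = sprintf('/%s\s*(.*)\s*%s/', preg_quote($exp1, '/'), preg_quote($exp2, '/'));

$resultado = preg_replace($regexp, '<?php $1 ?>', $meu_codigo_aqui);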

This produces the expression:

 '/%\[\s*(.*)\s*\]/'

I understand why the unexpected result happens: the expression only treats the last ] on the line as the end.

But what I want is for this same regular expression to return the data as follows:

 %[ echo "Expressão 1"] %[echo "Expressão 2"]

 <?php echo "Expressão 1" ?> <?php echo "Expressão 2" ?>

How can I do this?

When one expression ends with ] and another begins with %[ on the same line, I would like the two groups not to be merged, but each one to be interpreted separately.

    
asked by anonymous 29.01.2016 / 13:11

1 answer

5

It's simple, and you have already asked a question about this.

Change to:

 '/%\[\s*(.*?)\s*\]/'

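The ? after the * makes the quantifier non-greedy (lazy), so the capture stops at the first ] that lets the pattern match instead of running on to the last one.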
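To see the difference, here is a minimal sketch that runs both patterns on the input from the question. The variable names $input, $greedy and $lazy are only illustrative.

```php
<?php
$input = '%[ echo "Expressão 1"] %[echo "Expressão 2"]';

// Greedy: (.*) runs up to the last ] on the line, so both blocks are merged.
$greedy = preg_replace('/%\[\s*(.*)\s*\]/', '<?php $1 ?>', $input);

// Lazy: (.*?) stops at the first ] that lets the pattern match,
// so each block is replaced on its own.
$lazy = preg_replace('/%\[\s*(.*?)\s*\]/', '<?php $1 ?>', $input);

echo $greedy, PHP_EOL;
echo $lazy, PHP_EOL;
```

The second line printed is the result asked for in the question, with one PHP block per %[ ... ] group.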

Addendum

Be careful with \s : many people use it thinking that it matches only the ' ' (space) character, but it also matches tabs, line breaks and other whitespace. You may end up capturing more than you intended.
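A quick way to check this, as a small sketch with a made-up test string ($text is not from the question):

```php
<?php
// The string contains one space, one tab and one line break.
$text = "a b\tc\nd";

// preg_match_all() returns the number of matches found.
$anyWhitespace = preg_match_all('/\s/', $text); // space, tab and line break
$onlySpaces    = preg_match_all('/ /', $text);  // the literal space only

echo $anyWhitespace, PHP_EOL; // 3
echo $onlySpaces, PHP_EOL;    // 1
```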

You also mentioned the s modifier as the one that makes the regular expression work line by line. That is not correct. The line-by-line modifier is m .

  • s = single line (PCRE_DOTALL). The dot . also matches line breaks, so the whole subject is treated as if it were one line.
  • m = multi line (PCRE_MULTILINE). ^ and $ match at the start and end of every line, not only at the start and end of the whole string.
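A small sketch of how the two modifiers behave in PHP's preg functions; the sample string and variable names are assumptions made for the illustration:

```php
<?php
$text = "first\nsecond";

// Without s, the dot does not match the line break.
$plain  = preg_match('/first.second/', $text);  // 0
// With s, the dot also matches the line break.
$dotall = preg_match('/first.second/s', $text); // 1

// Without m, ^ matches only at the start of the whole string.
$noMulti = preg_match_all('/^\w+/', $text);     // 1
// With m, ^ matches at the start of every line.
$multi   = preg_match_all('/^\w+/m', $text);    // 2

echo "$plain $dotall $noMulti $multi", PHP_EOL;
```

So for a %[ ... ] block that spans several lines, it is s that lets (.*?) cross the line breaks.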

As for regular expressions in PHP, consider changing the start/end delimiter from / to something else, because with / every literal / you want to match has to be escaped.

  • /http:\/\// : the / had to be escaped, since PHP would otherwise treat it as the end of the regex, which would cause an error.
  • ~http://~ : there was no need to escape / , since the start/end delimiter is ~ .

I generally use ~ , because I rarely need to match it literally, but remember that the delimiter can be any non-alphanumeric character [^a-zA-Z0-9] other than a backslash or whitespace.
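The two delimiter styles can be compared with a short sketch. The URL is just a placeholder, and the last lines show that preg_quote() can escape the delimiter for you when it is passed as the second argument:

```php
<?php
$url = 'http://example.com'; // placeholder value for the illustration

// With / as the delimiter, each literal / has to be escaped.
$withSlash = preg_match('/http:\/\//', $url);

// With ~ as the delimiter, the slashes are written as they are.
$withTilde = preg_match('~http://~', $url);

// When the pattern is built from a string, preg_quote() escapes the
// delimiter as well if it is given as the second argument.
$built     = '/' . preg_quote('http://', '/') . '/';
$withQuote = preg_match($built, $url);

echo "$withSlash $withTilde $withQuote", PHP_EOL;
```

All three calls match the same text; only the amount of escaping changes.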

    
29.01.2016 / 13:17