Regular expression that excludes certain words

3

I'm looking for a particular recursive function whose name I cannot remember. To find it, I wrote the following regular expression (regex):

function (\w+)\([\x00-\xFF]+\(

It should find every function that calls itself by name (that is, every recursive function). I am aware that the [\x00-\xFF]+ part can produce unexpected results, such as:

function rl(){
    // code
}

function teste(){
    rl();
}

However, that does not matter here.

My problem is excluding certain names, such as index, busca and edit, to narrow down the results.

My search currently returns 900 results, and I believe about 70% of them come from those functions.

Failed attempts:

function ([^(index|busca|edit)])\(.*\)\{[\x00-\xFF]+\(
function ((?<!index)\w+)\(.*\)\{[\x00-\xFF]+\(
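A possible explanation of why both attempts fail, as a small JavaScript sketch. The sample strings are made up for illustration, and the patterns are shortened to the part that matches the name. In the first attempt, [^(index|busca|edit)] is a character class, so it matches one single character that is none of the listed letters or symbols. In the second, (?<!index) is a lookbehind, so it inspects the text before the name rather than the name itself.

```javascript
// Attempt 1: a negated character class matches exactly ONE character
// that is not "(", ")", "|" or one of the letters of the three words.
const charClass = /^[^(index|busca|edit)]$/;
const classResults = [
    charClass.test("z"),   // "z" is not in the set
    charClass.test("e"),   // "e" is in the set
    charClass.test("zz"),  // two characters can never match
];
console.log(classResults); // [ true, false, false ]

// Attempt 2: the lookbehind checks what comes BEFORE the name ("function "),
// which never ends in "index", so the name "index" is still captured.
const lookbehind = /function ((?<!index)\w+)\(/;
const captured = lookbehind.exec("function index(")[1];
console.log(captured); // index
```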
    
asked by anonymous 12.11.2014 / 14:56

3 answers

1

With the help of the reference given by jsantos1991 and the linked testing site, I combined one regex with another and arrived at this result:

(?!function (index|edit|busca))(function (\w+)\(.*\)[\x00-\xFF]+\()

In the first part:

(?!function (index|edit|busca))

This is a negative lookahead: it rejects any position where the text begins with "function index", "function edit" or "function busca". It also contains our first group: (index|edit|busca)

The second group is the main regex itself: (function (\w+)\(.*\)[\x00-\xFF]+\()

In the second regex:

(function (\w+)\(.*\)[\x00-\xFF]+\()

we have the third group: (\w+)

This is the part that looks for, as stated in the question, functions that refer to themselves.

In short, the second regex finds the functions and the first one says which ones not to capture.
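To see how the groups and the lookahead behave, here is a minimal JavaScript sketch using the regex above. The sample inputs and the stricter variant at the end are assumptions added for illustration, not part of the original answer. Two things it shows: a group inside a negative lookahead never holds a value in a successful match, and the lookahead as written also rejects names that merely start with a forbidden word.

```javascript
const regex = /(?!function (index|edit|busca))(function (\w+)\(.*\)[\x00-\xFF]+\()/;

// An allowed name: group 3 holds the function name, group 1 stays undefined
// because it lives inside the negative lookahead.
const allowed = regex.exec("function rl(){ rl(); }");
console.log(allowed[3], allowed[1]); // rl undefined

// A forbidden name: the lookahead rejects the only possible start position.
const forbidden = regex.exec("function index(){ index(); }");
console.log(forbidden); // null

// A name that only STARTS with a forbidden word is rejected as well.
const prefixed = regex.exec("function editar(){ editar(); }");
console.log(prefixed); // null

// Hypothetical stricter variant: requiring "(" right after the forbidden
// word limits the exclusion to exact names.
const strict = /(?!function (index|edit|busca)\()(function (\w+)\(.*\)[\x00-\xFF]+\()/;
const strictMatch = strict.exec("function editar(){ editar(); }");
console.log(strictMatch[3]); // editar
```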

    
12.11.2014 / 16:52
1

I put together a JavaScript function:

function localizaRecursoes(codigo) {
    // Regex literal: in a string passed to new RegExp every backslash would
    // have to be doubled, otherwise \s, \( and \1 are lost.
    var regex = /function[\s]+([a-zA-Z][a-zA-Z0-9_]*)[\s]*\(.*\)[\s]*\{[\x01-\xFF]*\1\(/g;
    var resultado = [];
    var match = null;
    do {
        match = regex.exec(codigo);
        // Skip any name that contains "index", "busca" or "edit".
        if (match != null && match[1].indexOf("index") == -1 && match[1].indexOf("busca") == -1 && match[1].indexOf("edit") == -1) {
            resultado.push(match[1]);
        }
    } while (match != null);
    return resultado;
}

To test it:

localizaRecursoes("function foo() { foo(); } function xoom() { xoom(); } function foq() { hghf(); } function ga() { ga(); } function buscaX() { buscaX(); } function yy() { yy(); } function feq() { hghf(); } function fre() { ghghgh fre(); dfsfdsf }");

Result:

["foo", "xoom", "ga", "yy", "fre"]
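The filter above rejects any name that contains one of the three words, which is why buscaX is missing from the result. If only the exact names should be excluded, a variation along these lines might work. findRecursiveNames, the Set argument and the sample string are hypothetical names introduced for this sketch.

```javascript
// Hypothetical variant: exclude only exact names, looked up in a Set.
function findRecursiveNames(code, excluded) {
    // Same pattern as in the function above, written as a regex literal.
    const regex = /function\s+([a-zA-Z][a-zA-Z0-9_]*)\s*\(.*\)\s*\{[\x01-\xFF]*\1\(/g;
    const names = [];
    for (const match of code.matchAll(regex)) {
        // Keep the name unless it is exactly one of the excluded ones.
        if (!excluded.has(match[1])) {
            names.push(match[1]);
        }
    }
    return names;
}

const sample = "function foo() { foo(); } function busca() { busca(); } function buscaX() { buscaX(); }";
const found = findRecursiveNames(sample, new Set(["index", "busca", "edit"]));
console.log(found); // [ 'foo', 'buscaX' ]
```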
    
12.11.2014 / 16:01
0

Well, you did not specify a language, so I'll use PHP, which I'm more familiar with. I believe the regex itself works in other languages as long as they support lookaround assertions, with language-specific adjustments where necessary:

/function ((?!index|busca|edit)\w+)\((.*?)\)\{[\x00-\xFF]+\1\(\2\)[\x00-\xFF]+\}/

A word (\w+) that is not one of the forbidden ones ((?!(word|word|word))) is matched.

Then the parentheses are matched, with anything inside them, to allow for a possible argument list. You can remove this part if you do not need it.

Then the braces delimiting the code block are matched and, inside them, any characters ([\x00-\xFF]+), followed by the function name matched earlier (\1), the parentheses and their contents (also removable), and again any characters, so that the call can appear anywhere in the code block.

The tests:

$str1 = 'function rl($a){
    rl($a)
}';

$str2 = 'function rl($a){
    //code
}';

$str3 = 'function index($a){
    anotherfunction($a)
}';

$str4 = 'function edit($a){
    edit($a)
}';

$str5 = 'function rl(){
    edit();
}';

// The pattern, stored once instead of being repeated in every call
$regex = '/function ((?!index|busca|edit)\w+)\((.*?)\)\{[\x00-\xFF]+\1\(\2\)[\x00-\xFF]+\}/';

preg_match( $regex, $str1, $m1 );
preg_match( $regex, $str2, $m2 );
preg_match( $regex, $str3, $m3 );
preg_match( $regex, $str4, $m4 );
preg_match( $regex, $str5, $m5 );

var_dump( $m1, $m2, $m3, $m4, $m5 );

Only the first one matches anything, because:

  • In the second the function is not called recursively
  • In the third we have a forbidden name
  • In the fourth we have a forbidden name and a recursion of a forbidden name
  • In the fifth we have a valid name, but the function is not called recursively
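
As a rough check of the claim that the pattern carries over to other languages, here is the same regex in JavaScript, with index in the forbidden list and the five test strings from above. The cases array and the matched variable are just illustrative names.

```javascript
const regex = /function ((?!index|busca|edit)\w+)\((.*?)\)\{[\x00-\xFF]+\1\(\2\)[\x00-\xFF]+\}/;

// The five test strings from the PHP example.
const cases = [
    "function rl($a){\n    rl($a)\n}",
    "function rl($a){\n    //code\n}",
    "function index($a){\n    anotherfunction($a)\n}",
    "function edit($a){\n    edit($a)\n}",
    "function rl(){\n    edit();\n}",
];

// true where the pattern matches, false otherwise.
const matched = cases.map((code) => regex.test(code));
console.log(matched); // [ true, false, false, false, false ]
```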
12.11.2014 / 17:46