How to select all characters except some specific words using regex?

7

Good morning everyone,

I would like to perform a string search for some specific character strings, but in the project we are working with we do not have access to any find function or similar, only to a regular expression-based substitution function (something like replace(string, regex, replacement) ).

The idea would then be to select all the characters EXCEPT the sequences that I want to find. I would remove these unwanted characters and compare what remains to what I want to find.

Example (not a specific language):

string expReg = ??????;
string texto = "xxxxxxxxboloxxxxxxxfarinhaxxxxxxacucarxxxx";
string busca = replace(texto, expReg, "");
if(busca == "bolofarinhaacucar"){
    return("Sucesso");
}

Luckily the words we need to find must be in the defined order, so it would not be necessary to include all the permutations.

We tried to find a solution using regular expressions, but we always run into the problem that positive lookbehind (?<=ABC) is not supported in JavaScript.

Any ideas?

    
asked by anonymous 11.02.2015 / 14:27

2 answers

3

It would help to know which language you are working in, so that we could suggest a more efficient approach, since negating a word in a regex is not easy.

Since, as you said, it will always be these specific words in this order, you can do something like this:

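// \w* (rather than \w+) also allows a word to sit at the very start or end of the text
string expReg = "\w*(bolo)\w*(farinha)\w*(acucar)\w*";

string texto = "xxxxxxxxboloxxxxxxxfarinhaxxxxxxacucarxxxx";
string busca = replace(texto, expReg, "$1$2$3"); // replaces the match with groups 1, 2 and 3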

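As a runnable illustration of the capture-group idea, here is a minimal JavaScript sketch. The helper name extractInOrder is hypothetical, and the pattern is anchored with ^ and $ on the assumption that the text contains only word characters, as in the example string. If the words are missing or out of order, replace leaves the text unchanged, so the comparison simply fails.

```javascript
// Hypothetical helper: keeps only the three words, provided they appear in this order.
function extractInOrder(texto) {
  // Lazy \w*? skips the filler; each word is captured in its own group.
  const expReg = /^\w*?(bolo)\w*?(farinha)\w*?(acucar)\w*$/;
  // When the pattern does not match, replace() returns the input unchanged.
  return texto.replace(expReg, "$1$2$3");
}

const texto = "xxxxxxxxboloxxxxxxxfarinhaxxxxxxacucarxxxx";
const busca = extractInOrder(texto);

if (busca === "bolofarinhaacucar") {
  console.log("Sucesso");
}
```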

    
11.02.2015 / 16:16
3

To remove a certain character from a sequence, simply make the following substitution (example in Javascript):

 
var texto = 'xxxxxxxxboloxxxxxxxfarinhaxxxxxxacucarxxxx';

var expReg = /([x]+)/g; // Matches one or more occurrences of "x"
var busca = texto.replace(expReg, '');

if(busca == "bolofarinhaacucar"){
    console.log("Sucesso");
}


To match everything except certain characters, use a negated character class by placing ^ at the beginning of the class.

var texto = 'xxxxxxxxboloxxxxxxxfarinhaxxxxxxacucarxxxx';

var expReg = /([^x]+)/g;
var buscaArray = texto.match(expReg); // match returns an array with the values found (or null if there are none)
var busca = buscaArray ? buscaArray.join("") : ""; // Join the matches into a single string so it can be compared

if(busca == "bolofarinhaacucar"){
    console.log("Sucesso");
}

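Both examples above depend on the filler being the single character "x". If the surrounding characters are arbitrary, one possible variation (a sketch, not something taken from the question) is to alternate between capturing a wanted word and matching any other single character, then replace each match with the captured group. In JavaScript, "$1" expands to an empty string when the group did not take part in the match, so only the words survive.

```javascript
const texto = "xxxxxxxxboloxxxxxxxfarinhaxxxxxxacucarxxxx";

// Either capture one of the wanted words, or consume any single other character.
const expReg = /(bolo|farinha|acucar)|[\s\S]/g;

// "$1" is the captured word, or an empty string when the second branch matched.
const busca = texto.replace(expReg, "$1");

if (busca === "bolofarinhaacucar") {
  console.log("Sucesso");
}
```

Because the words are kept in the order they occur, the final comparison still checks that they appear in the required sequence.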

The syntax may vary depending on the engine used. Here (in English) is a comparison between regular expression engines.

    
11.02.2015 / 15:59