How to make a regex that ignores non-alphanumeric characters?

1

For example: a regex that finds a match in the string "moeda" but also finds a match in the string "m,.o,e.d...a". In other words, it ignores non-alphanumeric characters regardless of their position or quantity.

NOTE: I KNOW IT CAN BE DONE IN THE FOLLOWING WAY, BUT THE IDEA IS THAT IT IS TYPED ONLY ONCE.

# = \W*

m#o#e#d#a

    
asked by anonymous 14.09.2017 / 18:55

5 answers

2

If I understood the term "capture" correctly, what you want is to remove the non-alphanumeric characters. Use replace; the negated regex should look like this:

[^a-z0-9]

The ^ sign inside [...] negates the character class, so replace removes every character that is not listed inside [^...].

In JavaScript you should use the global modifier, /.../g, together with /.../i if you need case-insensitive matching. Example:

var str = "m,.o,e.d...a";
var resposta = str.replace(/[^a-z0-9]/gi, "");
console.log(resposta);

In PHP it would be like this, with preg_replace :

$str = "m,.o,e.d...a";
$resposta = preg_replace('#[^a-z0-9]#i', '', $str);

var_dump($resposta);

Example online at ideone

Note:

If you want more characters to be kept, such as spaces, just add them inside [^...]. This example keeps alphanumeric characters and whitespace:

var str = "m,.o,e.d...a ,.n,.a,. ,.,.c,.a,.r,.t,.e,.i,.r,.a";
var resposta = str.replace(/[^a-z0-9\s]/gi, "");
console.log(resposta);

Capturing into an array

If you really want to capture, the right tools are .match in JavaScript and preg_match (or preg_match_all for every match) in PHP. The regex also becomes a bit more complex, since the string contains several words and you want to capture all of them, so it has to be something like this:

(^|\s)([a-z0-9]*[^\s]*)(\s|$)

JavaScript example:

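var str = "m,.o,e.d...a ,.n,.a,. ,.,.c,.a,.r,.t,.e,.i,.r,.a";
// The trailing separator is a lookahead, so it is not consumed and can also
// serve as the leading separator of the next word.
var respostas = str.match(/(^|\s)([^\s]+?)(?=\s|$)/g) || [];
var allowAN = /[^a-z0-9]/gi;

// Strip everything that is not alphanumeric from each captured word.
for (var i = 0, j = respostas.length; i < j; i++) {
    respostas[i] = respostas[i].trim().replace(allowAN, "");
}

console.log(respostas);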
    
14.09.2017 / 19:47
1

Just use \w+. It matches any character in the range A-Za-z0-9 (plus the underscore) one or more times.

console.log('m#o#e#d#a'.match(/\w+/g));
console.log('m,.o,e.d...a'.match(/\w+/g));
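Note that match with \w+ returns the alphanumeric pieces separately, not the whole word. A minimal sketch of joining them back together, assuming a helper name (onlyWordChars) that is purely illustrative:

```javascript
// Hypothetical helper: keep only word characters (\w = A-Z, a-z, 0-9 and _).
function onlyWordChars(text) {
    // match() returns null when nothing matches, so fall back to [].
    var pieces = text.match(/\w+/g) || [];
    return pieces.join("");
}

console.log(onlyWordChars("m#o#e#d#a"));    // moeda
console.log(onlyWordChars("m,.o,e.d...a")); // moeda
```

The joined string can then be compared with the word you are looking for.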
    
14.09.2017 / 19:05
0

Use this regular expression:

[0-9a-zA-Z]

Test here

    
14.09.2017 / 19:03
0

I'm not sure if I understand the request, but

var s="m.o...e,!d--a";
console.log(s.replace(/\W/g,"").match(/moeda/) ? "y":"n")

(that is, first remove the "non-letters" and then look for "moeda") can be used to detect whether a string contains "moeda".
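The same two-step idea could be wrapped in a function so that the word is typed only once. This is just a sketch; the name containsLoosely is made up for illustration, and it uses [^a-z0-9] instead of \W so that underscores are stripped as well:

```javascript
// Hypothetical helper: strip non-alphanumeric characters, then search.
function containsLoosely(text, word) {
    // Remove everything that is not a letter or digit and ignore case.
    var cleaned = text.replace(/[^a-z0-9]/gi, "").toLowerCase();
    return cleaned.includes(word.toLowerCase());
}

console.log(containsLoosely("m.o...e,!d--a", "moeda")); // true
console.log(containsLoosely("m.o...x,!d--a", "moeda")); // false
console.log(containsLoosely("como e da", "moeda"));     // true
```

Since spaces are removed too, the word can be found across word boundaries, as the last call shows; whether that is acceptable depends on the use case.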

    
14.09.2017 / 19:40
0

You cannot do it the way you want with regex.
In a regular expression you must describe what you want to find as a logical sequence of tokens and quantifiers. In your case you want to search for a given sequence, but without using quantifiers or tokens to skip whatever sits between its characters (as mentioned in the comment on that answer).
That is, the only way to do this would be a global regex flag such as x (which ignores whitespace in the pattern). No such flag exists for your case, because the result you want can already be obtained with tokens and quantifiers, which removes the need for a dedicated flag.

The result you want can only be obtained through this RegEx:

(m\W*o\W*e\W*d\W*a)

You can test it here

Explanation

  • () is the capture group that the regex returns if the condition is satisfied
  • m is the character that must be found first for the match to start
  • \W* means there can be zero or more non-alphanumeric characters
  • o is the next character, which must come after "m" for the condition to be satisfied
  • \W* has the same effect as before
  • and so on: the regex continues until it finds e, d and a, even when there are non-alphanumeric characters between them
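If the goal is to type the word only once, the pattern above could be generated from the word instead of written by hand. A minimal sketch, assuming a helper called buildLooseRegex (a hypothetical name, not part of any library):

```javascript
// Hypothetical helper: builds a regex that matches `word` even when
// non-alphanumeric characters appear between its letters.
function buildLooseRegex(word, flags = "i") {
    // Escape regex metacharacters so each character is matched literally.
    const escapeChar = (ch) => ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    // Join the characters with \W* (zero or more non-word characters).
    const pattern = Array.from(word).map(escapeChar).join("\\W*");
    return new RegExp(pattern, flags);
}

const moeda = buildLooseRegex("moeda");
console.log(moeda.source);               // m\W*o\W*e\W*d\W*a
console.log(moeda.test("moeda"));        // true
console.log(moeda.test("m,.o,e.d...a")); // true
console.log(moeda.test("m,.o,x.d...a")); // false
```

Keep in mind that \W does not match the underscore, so "m_o_e_d_a" would not match unless the separator is changed to something like [^a-z0-9]*.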
14.09.2017 / 19:44