What exactly is the "u" modifier for?


What exactly does the u modifier do in PHP's preg_ regular expression functions?

Should I use it whenever I process strings that have accented characters?

$valor = 'ãẽi ouã';
preg_match_all('/\w+/u', $valor, $matches);

$matches[0]; // array(2) { 'ãẽi', 'ouã' }
    
asked by anonymous 23.07.2015 / 20:47

1 answer


The /u modifier enables Unicode support: the pattern and the subject string are treated as UTF-8 instead of as raw bytes.

For example, you need it if you want to write a regex for Japanese text, or for any other characters that you refer to by their Unicode code point.

preg_match('/[\x{2460}-\x{2468}]/u', $str);

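Here \x{hex} is the hexadecimal Unicode code point of the character, not its UTF-8 byte sequence. The range above covers U+2460 to U+2468, the circled digits ① to ⑨.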
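As a minimal sketch of the code point syntax, the snippet below looks for one of those circled digits. The sample string and the variable names are made up for illustration, and the "\u{...}" string escape assumes PHP 7 or later.

```php
<?php
// "\u{2462}" is the PHP 7+ escape for U+2462 (circled digit three).
$str = "item \u{2462}";

// With /u, \x{2460}-\x{2468} is read as a range of Unicode code points.
$found = preg_match('/[\x{2460}-\x{2468}]/u', $str, $m);

var_dump($found);               // int(1)
var_dump($m[0] === "\u{2462}"); // bool(true)

// Without /u, code points above 0xFF are rejected when the pattern is
// compiled, so preg_match() emits a warning (silenced here) and returns false.
$withoutU = @preg_match('/[\x{2460}-\x{2468}]/', $str);
var_dump($withoutU);            // bool(false)
```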

Running the following regex with preg_match_all:

$valor = 'ãẽi ouã';
preg_match_all('/\w+/u', $valor, $matches);

returns:

array (
  0 => 
  array (
    0 => 'ãẽi',
    1 => 'ouã',
  ),
)

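Running the same regex without the modifier:

$valor = 'ãẽi ouã';
preg_match_all('/\w+/', $valor, $matches);

returns broken fragments, because each byte of the UTF-8 string is now treated as a separate character (the exact result depends on the locale):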

array (
  0 => 
  array (
    0 => '�',
    1 => '��',
    2 => 'i',
    3 => 'ou�',
  ),
)
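The broken fragments come from matching bytes instead of characters. A small sketch of that difference, assuming the string is UTF-8 encoded (the variable name is only illustrative):

```php
<?php
// "ã" (U+00E3) is stored as the two bytes C3 A3 in UTF-8.
$char = "\u{00E3}";

var_dump(strlen($char));               // int(2): strlen() counts bytes
var_dump(preg_match('/^.$/', $char));  // int(0): without /u, "." is one byte
var_dump(preg_match('/^..$/', $char)); // int(1): two bytes, two dots
var_dump(preg_match('/^.$/u', $char)); // int(1): with /u, "." is one character
```

This is why, without /u, a multibyte character can end up cut in the middle of a match.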

The modifier does not make a plain letter match its accented variants, so it cannot be used on its own to find accented vowels. For example:

$valor = 'ãẽi ouã';
preg_match_all('/a/u', $valor, $matches);

returns:

array (
 0 => 
  array (
  ),
)
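If the goal is to match the accented letters as well, one option is to list them explicitly in a character class. The sketch below assumes the string uses precomposed characters (a single code point for "ã") and PHP 7+ for the "\u{...}" escapes; the variable names are only illustrative.

```php
<?php
// 'ãẽi ouã' written with escapes so the file encoding does not matter.
$valor = "\u{00E3}\u{1EBD}i ou\u{00E3}";

// A plain "a" never matches "ã", with or without /u.
$plain = preg_match_all('/a/u', $valor);

// Listing the accented letter in a class works, and /u is required so that
// "ã" inside the class is read as one character, not as two bytes.
$class = preg_match_all("/[a\u{00E3}]/u", $valor, $m);

var_dump($plain); // int(0)
var_dump($class); // int(2)
```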

Test site: Link

Documentation: Link

    
23.07.2015 / 21:08