Validating a domain with preg_replace

I have the following line of code:

$site = preg_replace("/[^a-zA-Z0-9]/", "", $_POST['site']);

But with the string www.lostscavenge.com, this regular expression also strips the dots and returns: wwwlostscavengecom

How can I change this pattern so that it keeps the dots instead of removing them?

    
asked by anonymous 15.01.2017 / 23:48

2 answers

Domain names can nowadays contain Unicode characters (UTF-8). Depending on your business rules, if you need to accept domains with non-ASCII characters, the following routine may be useful:

function validate_domain_name($str, $force_utf8 = true)
{
    // The "u" flag makes PCRE treat the pattern and the subject as UTF-8.
    $flags = $force_utf8 ? 'u' : '';

    // Ineffective: this only restricts the allowed characters.
    //$re = '[^a-zA-Z0-9\.]';

    // Ineffective: it does not enforce basic domain-name rules.
    //$re = '^(http[s]?\:\/\/)?((\w+)\.)?(([\w-]+)?)(\.[\w-]+){1,2}$';

    // This one is more consistent.
    $re = '^(?!\-)(?:[\w\d\-]{0,62}[\w\d]\.){1,126}(?!\d+)[\w\d]{1,63}$';

    // Return the matched domain name, or null when $str is not valid.
    if (preg_match('/'.$re.'/'.$flags, $str, $rs) && !empty($rs[0])) {
        return $rs[0];
    } else {
        return null;
    }
}

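// Test cases: only the last assignment is used, so keep the one you want to try.
$str = '000-.com'; // invalid
$str = '-000.com'; // invalid
$str = '000.com'; // valid
$str = 'foo-.com'; // invalid
$str = '-foo.com'; // invalid
$str = 'foo.com'; // valid
$str = 'foo.any'; // valid
$str = 'お名前0.com'; // valid
$str = 'お名前.コム'; // valid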

echo 'domain: '.validate_domain_name($str);

To disable Unicode support, pass false as the second argument.
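Since the sample assignments above overwrite one another, only the last string is actually checked. A minimal sketch of how all of them could be tested in one run, with and without the u flag, is shown here. The helper name is_valid_domain is hypothetical (it is not part of the answer) and simply wraps the same pattern; the file is assumed to be saved as UTF-8 and run on PHP 7.0 or later.

```php
<?php
// Assumes this file is saved as UTF-8 and PHP 7.0+ (for the type declarations).

// Hypothetical helper: same pattern as validate_domain_name(), but returns a boolean.
function is_valid_domain(string $name, bool $unicode = true): bool
{
    $re = '^(?!\-)(?:[\w\d\-]{0,62}[\w\d]\.){1,126}(?!\d+)[\w\d]{1,63}$';
    $flags = $unicode ? 'u' : '';

    // preg_match() returns 1 on a match, 0 on no match and false on error.
    return preg_match('/' . $re . '/' . $flags, $name) === 1;
}

$candidates = [
    '000-.com', '-000.com', '000.com',
    'foo-.com', '-foo.com', 'foo.com', 'foo.any',
    'お名前0.com', 'お名前.コム',
];

foreach ($candidates as $name) {
    $withUnicode = is_valid_domain($name) ? 'valid' : 'invalid';
    $asciiOnly   = is_valid_domain($name, false) ? 'valid' : 'invalid';
    echo "$name: unicode=$withUnicode, ascii-only=$asciiOnly\n";
}
```

With the default C locale, the two Japanese names should be reported as valid only when the u flag is on, while the ASCII names give the same result in both modes.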

The regular expression was adapted from this answer: link

The adaptations I made were to replace a-zA-Z with \w (which also matches the underscore) and to add the option of using the u flag, which lets \w match non-ASCII characters.
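To make the effect of that substitution concrete, here is a small sketch comparing \w with the a-zA-Z\d class it replaced. The variable names are only illustrative, and the file is assumed to be saved as UTF-8.

```php
<?php
// \w matches letters, digits and the underscore; a-zA-Z\d rejects "_".
$wordClass  = preg_match('/^\w+$/', 'foo_bar');
$asciiClass = preg_match('/^[a-zA-Z\d]+$/', 'foo_bar');

// With the u flag, \w also matches non-ASCII letters.
$unicodeWord = preg_match('/^\w+$/u', 'お名前');

var_dump($wordClass);   // int(1)
var_dump($asciiClass);  // int(0)
var_dump($unicodeWord); // int(1)
```

One consequence is that the adapted pattern accepts underscores inside labels, so add an explicit check if your rules forbid them.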

    
16.01.2017 / 09:11

Just add the dot to your pattern:

$site = preg_replace("/[^a-zA-Z0-9\.]/", "", $_POST['site']);

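That is the \. at the end of the character class. The \ escapes the period (.), because in a regular expression an unescaped . outside a character class matches any character. Inside a character class such as [^...] the dot is already literal, so the escape is optional but harmless.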
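A minimal sketch of the difference between the two patterns, using a hard-coded string in place of $_POST['site'] (the variable names are only illustrative):

```php
<?php
// Hypothetical sample input standing in for $_POST['site'].
$input = 'www.lostscavenge.com';

// Original pattern: the dot is not in the allowed set, so it is stripped.
$withoutDot = preg_replace('/[^a-zA-Z0-9]/', '', $input);

// Fixed pattern: the dot is part of the allowed set and survives.
$withDot = preg_replace('/[^a-zA-Z0-9.]/', '', $input);

echo $withoutDot, "\n"; // wwwlostscavengecom
echo $withDot, "\n";    // www.lostscavenge.com
```

Note that this pattern still strips hyphens, which are legal inside domain labels. If you need to keep them, add - at the end of the character class.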

    
16.01.2017 / 00:00