Regex for URL validation

I wrote this regex to validate URLs:

const std::regex pattern(
   "(?:(ftp|http[s]?:[//])?)?([w]{3}[.])?"
   "(.*[.](com|php|net|org|br|dk|at|us|tv|info|uk|co.uk|biz|se)?)?"
   "(.*[.](aspx|htm|html|HTM|HTML|jhtm|jhtml|JHTM|JHTML|xhtm|xhtml|XHTM|XHTML)?)?"       
);

So far it validates what I expect, but in my view this is also a URL:

pt.stackforum.stack/questions/ask

or even:

pt.stackforum.stack/questions/

I tried to extend my regex to accept this kind of URL, but I could not.

Here is the complete program:

#include <iostream>
#include <regex>
#include <string>

struct url
{
  url(std::string url): _url(url){}

  // True only when the whole string matches the pattern.
  bool isUrl() const
  {
    // Built once and reused on every call.
    static const std::regex pattern(
      "(?:(ftp|http[s]?:[//])?)?([w]{3}[.])?"
      "(.*[.](com|php|net|org|br|dk|at|us|tv|info|uk|co.uk|biz|se)?)?"
      "(.*[.](aspx|htm|html|HTM|HTML|jhtm|jhtml|JHTM|JHTML|xhtm|xhtml|XHTM|XHTML)?)?"
    );
    return std::regex_match(_url, pattern);
  }

  // Prints the URL and flags it when it does not match.
  void print() const
  {
    std::cout << "\n\tUrl: " << _url << (isUrl() ? "\n" : " is Invalid\n");
  }

private:
  std::string _url;
};

int main()
{
  const std::string urls[11] = {"http://docs.microsoft.com/pt-br/sql/reporting-services/tools/url-examples-for-items-on-a-report-server-sharepoint-mode/",
                                "https://www.uolhost.uol.com.br/faq/v2/loja-virtual/o-que-e-um-endereco-de-url.php",
                                "https://www.uolhost.uol.com.br/index.asp",
                                "www.uolhost.uol.com.br/INDEX.htm",
                                "ftp://ftp.uolhost.uol.com.br/index.html",
                                "ftps://uolhost.uol.com.br/downloads/source",
                                "ftp.uolhost.uol.com.br/downloads/source/",
                                "ftp.uolhost.uol.com.br/downloads/source/index.jhtml",
                                "ftp.uolhost.uol.com.br/downloads/source/index.xhtml",
                                "ftp.uolhost.uol.com.br/downloads/source/index.aspx",
                                "ftp.uolhost.uol.com.br/downloads/source/index.XHTML"}; 

    for(int i=0; i<11; i++)
    url(urls[i]).print();
}
    
asked by anonymous 27.12.2017 / 22:45

1 answer

Try the one below. I built and tested it in a regex tester so that it also accepts local URLs and IP addresses:

^[a-zA-Z0-9-_]+[:./\\]+([a-zA-Z0-9 -_./:=&"'?%+@#$!])+$

I tested all of the values below and every one of them matched:

http://server/dev/Formularios/Cadastro_Usuario/Professor/Atualizar/Usuario.php?usuario_id=1245&nome=Ricardo%20Teixeira%Souza&cidade=Itabuna
https://www.google.com.br/search?q=%2520%25+space&oq=%2520%25+space&aqs=chrome..69i57j0l5.9863j0j7&sourceid=chrome&ie=UTF-8
localhost:8080
localhost/user/test.com/index.php
C:\Windows
google.com
ftp://teste.com
192.168.0.1
teste_server.com
teste-server.com
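
As a rough sketch of how this could be plugged into the C++ program from the question: the pattern above was only checked in a regex tester, so the version below is an adaptation, not the exact same expression. The function name looks_like_url is hypothetical. The backslash in the second character class is escaped, the capture group is dropped, and the hyphens are moved to the end of each class so that they are plain literals. That last change makes the final class narrower than the original, where " -_" acts as a range from space to underscore.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): returns true when `s`
// matches a tightened variant of the pattern suggested above.
bool looks_like_url(const std::string& s)
{
    // Raw string literal, so the regex backslashes need no C++ escaping.
    // Hyphens sit at the end of each class, where they are plain literals.
    static const std::regex pattern(
        R"re(^[a-zA-Z0-9_-]+[:./\\]+[a-zA-Z0-9 _./:=&"'?%+@#$!-]+$)re");

    // regex_match requires the whole string to match, not just a prefix.
    return std::regex_match(s, pattern);
}
```

With this variant, looks_like_url("pt.stackforum.stack/questions/ask") and looks_like_url("pt.stackforum.stack/questions/") should both return true, since all the pattern asks for is a leading name, at least one separator, and a tail of allowed characters. Keep in mind that it is very permissive: it accepts C:\Windows too, so it checks for "something address-like" and not for a well-formed URL.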
    
28.12.2017 / 20:30