How to detect if my site was visited by a search engine?

0

I'm using PHP and I read about the $_SERVER['HTTP_USER_AGENT'] variable, but I don't know how to detect visits from all search engines. I would like to detect any search bot and send it the information it needs via HTTP headers. That is, my site will not have a physical robots.txt file.

    
asked by anonymous 22.10.2014 / 02:43

2 answers

2

The most complete working option I've found so far is this:

function isBot(){
    // True when the User-Agent contains a keyword commonly used by crawlers
    return isset($_SERVER['HTTP_USER_AGENT'])
        && preg_match('/bot|crawl|slurp|spider/i', $_SERVER['HTTP_USER_AGENT']) === 1;
}
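
Since the question is about sending instructions to bots through an HTTP header, here is a minimal sketch of how the check could be wired to one. The helpers looksLikeBot() and robotsHeaderFor() are hypothetical names, and "noindex, nofollow" is only an example policy. X-Robots-Tag is a response header the major search engines read for indexing directives; it does not cover everything a robots.txt file does.

```php
<?php
// Hypothetical helper: same keyword pattern as isBot(), but it takes the
// User-Agent as a parameter so it can be tried without a real request.
function looksLikeBot(string $userAgent): bool
{
    return preg_match('/bot|crawl|slurp|spider/i', $userAgent) === 1;
}

// Hypothetical helper: returns the header line to send to a crawler,
// or null for any other visitor. The directive is an example policy.
function robotsHeaderFor(string $userAgent): ?string
{
    return looksLikeBot($userAgent) ? 'X-Robots-Tag: noindex, nofollow' : null;
}

$line = robotsHeaderFor($_SERVER['HTTP_USER_AGENT'] ?? '');
if ($line !== null && !headers_sent()) {
    header($line); // headers must be sent before any page output
}
```

Keep in mind that the User-Agent is supplied by the client, so this only catches bots that identify themselves.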
    
22.10.2014 / 14:54
2

If you use $_SERVER['HTTP_USER_AGENT'], that means you want to put a test on every page. Something like:

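 $a = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
 // Placeholder: the exact User-Agent string of the search engine to detect
 $searchEngine = 'search engine user agent';
 if ($a == $searchEngine)
 {
    // Let's get out of here
 }
 // If we get here, it is not a search engine, so we can continue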

The difficulty is the test itself. You have 2 options:

  • You want to authorize only one browser. For example, you want to be the only one with access. In this case the test is: if HTTP_USER_AGENT equals my browser's, fine; if not, bye bye! This is easy because you know your own browser's HTTP_USER_AGENT.

  • You want to block the search engines. But in this case you need to know the HTTP_USER_AGENT of each engine... I think that's impossible, because there are a lot of them and there is no rule about the format.

  • For example, here are the HTTP_USER_AGENT values of 4 bots (search engines):

    Mozilla/5.0 (compatible; Baiduspider/2.0; +link)

    Mozilla/5.0 (compatible; Exabot/3.0; +link)

    msnbot-media/1.1 (+link)

    TurnitinBot/3.0 (link)

    They are very different from each other, and checking in PHP that each one is a search engine seems very complicated to me. You need to find another option.

    One question: what is the real goal? Security? Privacy?
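
On the four strings quoted above: different as they are, a loose keyword test can still be tried against them. The sketch below assumes a pattern such as /bot|crawl|slurp|spider/i and keeps "link" as a stand-in for the URLs missing from the text; the browser string is an assumed example for comparison. It only shows how such a test behaves on these samples, not that it catches every engine.

```php
<?php
// The four crawler strings from this answer; "link" replaces the missing URLs.
$agents = [
    'Mozilla/5.0 (compatible; Baiduspider/2.0; +link)',
    'Mozilla/5.0 (compatible; Exabot/3.0; +link)',
    'msnbot-media/1.1 (+link)',
    'TurnitinBot/3.0 (link)',
];

// Assumed keyword pattern, case-insensitive.
$pattern = '/bot|crawl|slurp|spider/i';

// Keep only the strings that contain one of the keywords.
$matched = array_filter($agents, function ($ua) use ($pattern) {
    return preg_match($pattern, $ua) === 1;
});

// Assumed example of a desktop browser User-Agent, for comparison.
$browser = 'Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0';

echo count($matched), ' of ', count($agents), " crawler strings matched\n";
echo 'browser matched: ', var_export(preg_match($pattern, $browser) === 1, true), "\n";
```

A bot that sends a browser-like User-Agent would pass through unnoticed, which is why the goal matters before choosing this approach.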

        
    22.10.2014 / 12:58