Check if a link is active or broken [duplicate]

2

I would like to know how to use PHP to check whether links on download servers such as Mega, Google, Filehero, and others are still active, and to report a link as offline when it is not.

In this case, this link is offline.

How can I tell programmatically whether a particular link is offline?

I found several snippets that check whether pages on the internet exist, but none of them work for links on these servers.

I found this code on the internet, but it does not work properly: even when the link is online, it reports the link as broken. How can I fix this code? Or could someone give me one that works for any download link from mega, 4shared, minhacca, etc.?

<?php

$url = 'http://mega.co.nz/#!0Q9zGIwb!v_CAoVPESQ9TExR7H66kA_ZPjjaZCZtBUHZE5_OmcIc';
$result = @file_get_contents($url);

// check whether the URL exists
if ($result !== false):
    // look for the captcha form id, id='captchaform'
    $pos = stripos($result, 'captchaform');

    // if id='captchaform' is found, this is the download page
    if ($pos !== false):
        echo 'Url On';
    endif;

else:
    echo 'Url off!';
endif;

?>
    
asked by anonymous 02.06.2015 / 17:12

2 answers

2

One way to do this is to make a request and check that the response status code is 200, which indicates that the request succeeded.

Here is an example using the cURL library:

/*
  Arguments:
    $url:    The URL to check
    $limite: Timeout in seconds. Optional, defaults to 25
  Returns:
    true:  If the URL is available
    false: If the URL is broken
*/
function verificarLink($url, $limite = 25){
    $curl = curl_init();                               // Start a new cURL session
    curl_setopt($curl, CURLOPT_URL, $url);             // Set the URL to request
    curl_setopt($curl, CURLOPT_TIMEOUT, $limite);      // Set the request timeout
    curl_setopt($curl, CURLOPT_NOBODY, true);          // Send a "HEAD" request
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);  // Return the output instead of printing it
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false); // Do not verify the site's certificate (insecure)

    curl_exec($curl);  // Run the cURL session
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE) == 200; // If the response is OK, the URL is active
    curl_close($curl); // Close the cURL session

    return $status;
}

Example usage:

$link = "http://mega.co.nz/#!0Q9zGIwb!v_CAoVPESQ9TExR7H66kA_ZPjjaZCZtBUHZE5_OmcIc";
$status = verificarLink($link);
if ($status) {
    echo "The given link is available!";
} else {
    echo "The given link is broken.";
}
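
Treating only 200 as "active" is fairly strict. As a rough sketch, the status code could be mapped to a few coarse states instead. The function name classificarStatus and the state labels below are hypothetical, not part of cURL. Note that some hosts may answer 200 with an error page for a removed file, so the status code alone may not be conclusive.

```php
<?php
// Hypothetical helper: maps an HTTP status code (for example the value of
// curl_getinfo($curl, CURLINFO_HTTP_CODE)) to a coarse link state.
function classificarStatus(int $codigo): string
{
    if ($codigo === 0) {
        return 'unreachable'; // cURL reports 0 when no HTTP response was received
    }
    if ($codigo >= 200 && $codigo < 300) {
        return 'online';      // request succeeded
    }
    if ($codigo >= 300 && $codigo < 400) {
        return 'redirect';    // the link points somewhere else; check the target
    }
    if ($codigo === 404 || $codigo === 410) {
        return 'broken';      // not found, or gone for good
    }
    return 'unknown';         // other client or server errors
}
```

For instance, classificarStatus(404) returns 'broken', while a timeout (code 0) returns 'unreachable'.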
    
03.06.2015 / 14:33
0

If you look at the URL, it contains #! followed by a string of odd-looking characters. The #! indicates that what actually takes the user to the right place is a piece of JavaScript running in the user's browser. Everything after the # is a fragment, which is never sent to the server, so a plain HTTP request for this URL only fetches the site's main page, whether or not the file exists. You could try running the page's JavaScript with something like phpjs, but the most sensible solution is to try converting those URLs using Google's _escaped_fragment_ scheme (https://developers.google.com/webmasters/ajax-crawling/docs/getting-started). In this case, you would try to access http://mega.co.nz/?_escaped_fragment_=0Q9zGIwb!v_CAoVPESQ9TExR7H66kA_ZPjjaZCZtBUHZE5_OmcIc , but it is not clear to me whether mega.co.nz follows Google's specification.
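
A minimal sketch of that conversion, assuming the target site honors the _escaped_fragment_ scheme at all (which is not confirmed for mega.co.nz). The function name toEscapedFragment is hypothetical, and only the characters %, #, & and + are escaped here.

```php
<?php
// Hypothetical helper: rewrites a "#!" (hashbang) URL into its
// _escaped_fragment_ form. URLs without "#!" are returned unchanged.
function toEscapedFragment(string $url): string
{
    $pos = strpos($url, '#!');
    if ($pos === false) {
        return $url; // no hashbang, nothing to convert
    }

    $base = substr($url, 0, $pos);      // everything before "#!"
    $fragment = substr($url, $pos + 2); // everything after "#!"

    // Escape the characters that would otherwise change the query string.
    $escaped = strtr($fragment, [
        '%' => '%25',
        '#' => '%23',
        '&' => '%26',
        '+' => '%2B',
    ]);

    // Append as a new query parameter, keeping any existing query string.
    $separator = (strpos($base, '?') === false) ? '?' : '&';

    return $base . $separator . '_escaped_fragment_=' . $escaped;
}
```

Passing the Mega link from the question to toEscapedFragment() yields the ?_escaped_fragment_= URL shown above, which could then be checked with an ordinary HTTP request.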

    
03.06.2015 / 17:41