The question asks about the syntax for doing this with wget, but since it is tagged php, this answer is based on that language. You can try the following:
1. Download the page content: with file_get_contents, cURL, or any other method you know.
2. Extract the links from the page: you can use preg_match or parse the HTML with DOMDocument.
3. Download the file from each URL: you can use file_put_contents, or cURL together with fopen to open the file for writing.

Step by step:
To download the page content, with cURL:
function obterPagina($url) {
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // Return the response instead of printing it
    $pagina = curl_exec($curl);
    curl_close($curl); // Free the cURL handle
    return $pagina;
}
Note: you can include more options depending on your needs; see the curl_setopt documentation for the full list.
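If you prefer the file_get_contents route mentioned in step 1, the same fetch can be sketched in a single call (the function name obterPaginaSimples is just an illustrative choice; for remote URLs it requires allow_url_fopen to be enabled, and it gives you less control over timeouts and headers than cURL):

```php
<?php
// Fetches the content at $url with file_get_contents instead of cURL.
// Returns the content as a string, or false on failure.
function obterPaginaSimples($url) {
    return @file_get_contents($url); // @ suppresses the warning when the URL fails
}
```

It also works for local paths, which makes it easy to test without a network connection.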
To extract the links from the page, with DOMDocument:
function obterLinks($url, $pagina, $extensoes = ['gif', 'jpg']) { // Accepted extensions
    $dom = new DOMDocument;
    $links = [];
    libxml_use_internal_errors(true); // Suppress warnings from malformed HTML
    if ($dom->loadHTML($pagina) !== false) {
        foreach ($dom->getElementsByTagName('a') as $link) { // Iterate over every "a" element
            $href = $link->getAttribute('href');
            $extensao = pathinfo($href, PATHINFO_EXTENSION);
            if (in_array($extensao, $extensoes)) {
                $links[] = $url . $href; // Assumes hrefs are relative to $url
            }
        }
        return $links;
    }
    return false;
}
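As an alternative, step 2 can also be done with preg_match, as mentioned above. A minimal sketch (the name obterLinksRegex is illustrative, and the regex only catches double-quoted href values; the DOMDocument approach is more robust for real-world HTML):

```php
<?php
// Extracts hrefs whose extension is in the accepted list, using a regex.
function obterLinksRegex($url, $pagina, $extensoes = ['gif', 'jpg']) {
    $links = [];
    if (preg_match_all('/href="([^"]+)"/i', $pagina, $matches)) {
        foreach ($matches[1] as $href) {
            $extensao = strtolower(pathinfo($href, PATHINFO_EXTENSION));
            if (in_array($extensao, $extensoes)) {
                $links[] = $url . $href; // Same concatenation as the DOM version
            }
        }
    }
    return $links;
}
```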
To download the file, also with cURL:
function baixarArquivo($url, $salvarComo, $timeout = 3600) {
    $curl = curl_init();
    $fp = fopen($salvarComo, 'w'); // Open the file for writing
    if (!$fp)
        return false;
    $opts = array(CURLOPT_URL     => $url,
                  CURLOPT_FILE    => $fp,
                  CURLOPT_TIMEOUT => $timeout); // Timeout, default is 1 hour
    curl_setopt_array($curl, $opts);
    $ret = curl_exec($curl);
    curl_close($curl);
    fclose($fp);
    return $ret !== false;
}
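The file_put_contents route mentioned in step 3 can be sketched even more briefly (the name baixarArquivoSimples is illustrative; unlike the cURL version there is no timeout control, and the whole file is held in memory before being written):

```php
<?php
// Downloads $url and writes it to $salvarComo.
// Returns true on success, false if the read or the write fails.
function baixarArquivoSimples($url, $salvarComo) {
    $dados = @file_get_contents($url); // @ suppresses the warning on failure
    if ($dados === false)
        return false;
    return file_put_contents($salvarComo, $dados) !== false;
}
```

For large files, prefer the cURL version above, which streams straight to disk.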
To use it all together:
$url = "http://www.tarararara.com/images/";
$pagina = obterPagina($url);
if ($pagina) {
    $links = obterLinks($url, $pagina);
    if ($links) {
        foreach ($links as $link) {
            var_dump( baixarArquivo($link, basename($link)) ); // Saves in the same folder as the script
        }
    }
}