How to create a robot with PHP? [closed]

Question



What is the best way to create a Robot in PHP?

The purpose of the robot is to access a URL with a login and password, enter data in certain fields, submit this data, and interpret the result.

The goal is to update an internal database according to the result returned by the query.

The data to be submitted will come from another database.

Any suggestions?

php

asked by anonymous 01.04.2014 / 18:21

1 answer


score 4 · Accepted Answer

Robots that fetch and interpret information from other pages are also called web crawlers or spiders.

These are scripts that perform the following process:

1. Request a URL.

2. Store the response in a variable.

3. Parse the returned HTML.

4. Search for the relevant information.

5. Process the information obtained.

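Steps 1 through 3 can be handled as follows:

// The scheme is required; without it the string is treated as a local file path
$url = 'http://www.exemplo.com';
$dom = new DOMDocument('1.0');
// Collect parser warnings from imperfect real-world HTML instead of printing them
libxml_use_internal_errors(true);
$dom->loadHTMLFile($url);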

This gives you an object that lets you navigate the HTML as needed.

For example, getting all the links on a page and displaying their addresses looks like this:

$anchors = $dom->getElementsByTagName('a');
foreach ($anchors as $element) {
    $href = $element->getAttribute('href');
    echo $href . '<br>';
}
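
Searching for relevant information is often easier with XPath than with nested loops over tag names. The following is a minimal sketch using PHP's built-in DOMXPath class; the HTML string, the "result" id and the "status" class are hypothetical stand-ins for whatever the target page actually returns.

```php
<?php
// Hypothetical HTML standing in for a fetched page.
$html = '<html><body>
  <table id="result">
    <tr><td class="status">Approved</td><td class="code">42</td></tr>
  </table>
  <a href="/next">Next</a>
</body></html>';

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select cells with class "status" inside the table whose id is "result".
$nodes = $xpath->query('//table[@id="result"]//td[@class="status"]');

// Take the text of the first match, or null when nothing matched.
$status = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;

echo $status, PHP_EOL;
```

The value in $status is the kind of result that could then drive an update to the internal database.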

A useful class that simplifies HTML handling and saves a lot of code is Simple HTML DOM; a tutorial on how to use it can be found on Make Use Of.

To fill in a form, it is enough to send a request to the URL the form points to, using the request method it expects. In other words, request the URL given in the form's action attribute, using the method given in its method attribute.
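
As an illustration, the action, the method and the field names can be read from the form itself with DOMDocument. This is only a sketch: the form below, its field names and the token value are hypothetical, and a real page may contain several forms or fields other than input elements.

```php
<?php
// Hypothetical login form, standing in for the HTML of the target page.
$html = '<form action="/login.php" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="token" value="abc123">
  <input type="submit" value="Enter">
</form>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// First form on the page.
$form = $dom->getElementsByTagName('form')->item(0);

// Where and how the form expects to be submitted; HTML defaults to GET.
$action = $form->getAttribute('action');
$method = strtoupper($form->getAttribute('method') ?: 'GET');

// Collect every named input with its default value (hidden tokens included).
$fields = array();
foreach ($form->getElementsByTagName('input') as $input) {
    $name = $input->getAttribute('name');
    if ($name !== '') {
        $fields[$name] = $input->getAttribute('value');
    }
}

echo $method . ' ' . $action . PHP_EOL;
print_r($fields);
```

Hidden fields such as the token above usually have to be sent back unchanged along with the values the robot fills in.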

To simulate this, we change the previous request code to:

$curl = curl_init();
// Set the request options, including a user agent
curl_setopt_array($curl, array(
    // Return the content as a string
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_URL => 'http://www.exemplo.com',
    // Name that identifies your robot
    CURLOPT_USERAGENT => 'Nome do seu crawler',
    // Send the request with the POST method
    CURLOPT_POST => 1,
    // Parameters sent via POST (array keys must be quoted strings)
    CURLOPT_POSTFIELDS => array(
        'item1' => 'value',
        'item2' => 'value2'
    )
));

// Perform the request and store the result in $response
$response = curl_exec($curl);

// Close the request handle
curl_close($curl);

$dom = new DOMDocument('1.0');

// Parse the string returned by the request
// Note that the method changed from loadHTMLFile to loadHTML
$dom->loadHTML($response);
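
Since the question involves a login, the robot usually has to keep the session between the login request and the form submission. One common approach, assuming the site uses cookie-based sessions, is to point cURL at a cookie file with CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE. The sketch below is not a definitive implementation: the helper functions buildRequestOptions and request, the URLs, the field names and the cookie path are all hypothetical.

```php
<?php
// Builds the cURL options for one request that shares a cookie file
// with the other requests of the same session (hypothetical helper).
function buildRequestOptions(string $url, string $cookieFile, ?array $postFields = null): array
{
    $options = array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        // Cookies received are written to this file when the handle is closed or freed
        CURLOPT_COOKIEJAR      => $cookieFile,
        // Cookies already stored in this file are sent with the request
        CURLOPT_COOKIEFILE     => $cookieFile,
    );
    if ($postFields !== null) {
        $options[CURLOPT_POST] = true;
        // Encode the fields as a regular URL-encoded form body
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields);
    }
    return $options;
}

// Executes one request and returns the body, or false on failure (hypothetical helper).
function request(array $options)
{
    $curl = curl_init();
    curl_setopt_array($curl, $options);
    $response = curl_exec($curl);
    curl_close($curl);
    return $response;
}

$cookieFile = sys_get_temp_dir() . '/robot_cookies.txt';

// Assumed login endpoint and field names; adjust to the real form.
$loginOptions = buildRequestOptions(
    'http://www.exemplo.com/login.php',
    $cookieFile,
    array('username' => 'user', 'password' => 'secret')
);

// Assumed data form, submitted with the session cookie from the login.
$formOptions = buildRequestOptions(
    'http://www.exemplo.com/form.php',
    $cookieFile,
    array('item1' => 'value')
);

// Uncomment to run against a real site:
// request($loginOptions);
// $response = request($formOptions);
```

Keeping the option building separate from the request makes it possible to inspect what will be sent before touching the remote site.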

Learn more about cURL