How to run a Python crawler from PHP

1

I made a crawler with Python and run it through the command line:

python crawler.py

As soon as I run this command, it asks me for the keyword to search for and then starts running.

global keyword
keyword = raw_input('Keyword: ')

This data is used on a Laravel / PHP platform. I would like to add a "Refresh" button so that the user can run the crawler just by clicking it and entering the keyword.

    
asked by anonymous 27.01.2017 / 13:45

1 answer

2

If you are still writing the crawler, or if it is simple, it may be better to write it entirely in PHP, but that is just a suggestion.

If you want to send input to another program, then instead of system, exec or shell_exec you should use popen. It opens a pipe to the process, which lets you "talk" to it through a stream and send commands to interactive programs such as telnet (just an example).

  

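With a handle opened in write mode ('w') you do not need fgets or fread: the child process writes straight to the standard output (STDOUT) it inherits from PHP. Because that output bypasses PHP's own output buffering, ob_start may not capture it. If you need the crawler's output in a PHP variable, proc_open, which exposes separate pipes for STDIN and STDOUT, is a more reliable option.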

An example should look like this:

<?php

$input = 'Command sent'; // text written to the crawler's STDIN

$dir = dirname(__FILE__); // directory of the current script

$comando = 'python ' . $dir . '/crawler.py'; // build the command

$handle = popen($comando, 'w'); // start the process with a writable pipe

fwrite($handle, $input);

pclose($handle);

And a Python example (I use 3.6 here; on Python 2, just replace input with raw_input):

import sys

keyword = input('Keyword: ')

print(keyword)
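On the Python side, it can help to read the keyword in a way that behaves the same whether it is typed at a terminal or written by PHP through the pipe. The following is a minimal sketch under that assumption; the helper name read_keyword is hypothetical and not part of the original crawler.

```python
import sys


def read_keyword(stream=None):
    """Read one keyword from a text stream (defaults to sys.stdin).

    Returns the stripped keyword, or None if the stream is empty
    or the line contains only whitespace.
    """
    if stream is None:
        stream = sys.stdin
    # readline() returns '' at end of input instead of raising EOFError,
    # so a pipe that is closed without data is handled gracefully.
    line = stream.readline()
    keyword = line.strip()
    return keyword or None
```

In crawler.py, calling keyword = read_keyword() could then take the place of input('Keyword: '), with a None result treated as "no keyword was sent".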

However, depending on the operating system or the program being called, you may run into problems. On Windows you can prefix the command with start /B, and on Unix-like systems you can append 2>&1 to redirect STDERR to STDOUT, like this:

<?php
$dir = dirname(__FILE__);

$comando = $dir . '/crawler.py';
$input = 'Hello world!';

if (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN') {
    $comando = 'start /B python ' . escapeshellarg($comando);
} else {
    $comando = 'python ' . escapeshellarg($comando) . ' 2>&1';
}

$handle = popen($comando, 'w'); // start the process

fwrite($handle, $input);

pclose($handle);
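To check the crawler's side of the pipe without going through PHP, the same interaction can be reproduced from Python with the standard subprocess module. This is only a sketch: the inline child script stands in for crawler.py, the helper name run_with_keyword is hypothetical, and capture_output assumes Python 3.7 or later.

```python
import subprocess
import sys

# Stand-in for crawler.py (an assumption for this sketch):
# it reads the keyword from STDIN and echoes it back.
CHILD_SCRIPT = "keyword = input('Keyword: '); print(keyword)"


def run_with_keyword(keyword):
    """Start a child Python process, write the keyword to its STDIN,
    and return everything the child printed to STDOUT."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        input=keyword + "\n",  # plays the role of fwrite($handle, $input) in PHP
        capture_output=True,   # collect STDOUT/STDERR instead of inheriting them
        text=True,             # exchange str instead of bytes
        check=True,            # raise if the child exits with an error
    )
    return result.stdout


output = run_with_keyword("laravel")
print(output)
```

If the keyword appears at the end of the returned output, the script is reading from a pipe correctly, and any remaining problem is more likely on the PHP side (command path, permissions, or the python executable not being on the web server's PATH).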
    
27.01.2017 / 14:00