But this is basically what the crawler should do. It will be up to you to keep a database with the list of sites you want to scan and a cron job to schedule the scans; on each cron run you would launch this script and pass the site you want to scan as an argument, for example: $crawler->setURL($argv[1]).
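To make that concrete, here is a minimal sketch of such a per-site command-line entry point. SiteCrawler is a hypothetical stand-in for whatever crawler class you actually use (anything exposing a setURL() method, as in the snippet above); the stub only echoes so the script stays runnable on its own.

<?php
// crawl-site.php - minimal sketch of a per-site CLI entry point.
// "SiteCrawler" is a hypothetical placeholder for your real crawler class.

class SiteCrawler
{
    private $url = '';

    public function setURL($url)
    {
        $this->url = $url;
    }

    public function go()
    {
        // Real crawling logic would live here; this stub just reports.
        echo "Crawling {$this->url}\n";
    }
}

if ($argc < 2) {
    fwrite(STDERR, "Usage: php crawl-site.php <url>\n");
    exit(1);
}

$crawler = new SiteCrawler();
$crawler->setURL($argv[1]);   // the target site comes in as a CLI argument
$crawler->go();

A cron entry along the lines of 0 * * * * php /path/to/crawl-site.php https://example.com (paths and schedule are only examples) would then start one scan per hour for that site.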
Do not expect a single PHP request to process numerous sites; that would be very hard on your server. Google, Yahoo and Bing scan different sites periodically and in separate routines, and they probably limit themselves to something like one site per hour, continuing only later.
If a single request and a single PHP script tried to access multiple URLs, the application would turn into a long-running process that could take hours, and depending on the number of URLs, PHP would not be able to release what it has used, so processor or memory consumption would keep rising until your server starts to hang.
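One way to keep a single run from dragging on and eating the server is to check elapsed time and memory after each page and bail out early. The sketch below illustrates the idea; the limits and the shouldStop() helper are my own assumed names, not from any particular library.

<?php
// Per-run guard (hypothetical limits): stop the crawl before the PHP
// process runs too long or grows large enough to starve the server.

const MAX_RUNTIME_SECONDS = 300;               // assumed time budget per cron run
const MAX_MEMORY_BYTES    = 256 * 1024 * 1024; // assumed memory ceiling

function shouldStop($startedAt)
{
    return (time() - $startedAt) > MAX_RUNTIME_SECONDS
        || memory_get_usage(true) > MAX_MEMORY_BYTES;
}

$startedAt = time();

// Inside the crawl loop you would check the guard after each page:
// if (shouldStop($startedAt)) { /* save your position and exit */ }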
The most appropriate way (not necessarily the only correct one) is to scan one site at a time, impose a limit per run, and, if you hit that limit, resume where you left off on the next run. Remember that some sites have more than 50,000 pages.
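Here is a sketch of that limit-and-resume idea. It assumes a sites table with a last_offset column; the table, the column, the helper names and the 500-pages-per-run figure are all assumptions you would adapt to your own schema.

<?php
// Scan in slices and resume: persist how far we got so the next cron run
// continues from that point instead of starting over.

const PAGES_PER_RUN = 500;   // hypothetical per-run page limit

$pdo = new PDO('mysql:host=localhost;dbname=crawler', 'user', 'pass');

function loadOffset(PDO $pdo, $siteUrl)
{
    $stmt = $pdo->prepare('SELECT last_offset FROM sites WHERE url = ?');
    $stmt->execute([$siteUrl]);
    return (int) $stmt->fetchColumn();
}

function saveOffset(PDO $pdo, $siteUrl, $offset)
{
    $stmt = $pdo->prepare('UPDATE sites SET last_offset = ? WHERE url = ?');
    $stmt->execute([$offset, $siteUrl]);
}

$siteUrl = $argv[1];
$offset  = loadOffset($pdo, $siteUrl);

// Crawl at most PAGES_PER_RUN pages, starting from $offset, then persist
// the new position so the next cron run picks up where this one stopped.
$processed = 0;
while ($processed < PAGES_PER_RUN) {
    // ... fetch and process page number ($offset + $processed) here ...
    $processed++;
}

saveOffset($pdo, $siteUrl, $offset + $processed);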