Questions tagged as 'web-crawler'

1
answer

Do search engine crawlers / bots / web-spiders copy and access the href of a link, or "click" on the page to be redirected?

I ask because I want to build a portal in Ajax whose pages can also be accessed directly via URL. My question is: if an <a> </a> returns false when clicked, will the web spider be unable to follow the href...
asked by 05.03.2017 / 15:35
1
answer

Making a request to a page with Guzzle

I'm having trouble making a POST request to a site using the Guzzle component. The target site is: link. The request does reach the site, but no results appear. I don't know whether the problem is in HOW I make the request or in the PARAMET...
asked by 29.05.2015 / 20:05
0
answers

Problem collecting links from a site

Dear all, good morning! I'm writing a program in Python to collect links from a website. The part of the code that collects the links is: links = driver.find_elements_by_xpath('//*[@href]') for link in links: print(link.get_attribute('hre...
asked by 31.10.2018 / 13:48
1
answer

How to handle ERR_CONNECTION_TIMED_OUT error in webscraping using a list

I'm learning Node.js and, with the help of @Sorack, I've been playing around with web scraping. With the code, the following happens: when the statusCode of the page is equal to 200, the page returns the information and it is generated in the file as re...
asked by 02.10.2018 / 22:12
1
answer

How to name each line in a URL list

Can I name each line in the URL list, so that it returns the nickname I gave it? Something like this, so that the result would be: Prefeitura Municipal de Bocaiúva do Sul | PRONIM TB 518.01.07-013 | Prefeitura Municipal de Matinhos | PRONIM TB 518.01.04-000 | | PR...
asked by 01.10.2018 / 22:25
1
answer

Crawler - how to access several pages

I wrote some code in Node to fetch the system version and the municipality name from a portal, but I can't get it to fetch the information for other municipalities, only for one. In the request, I would like it to loop and acces...
asked by 28.09.2018 / 20:29
1
answer

How can I use Scrapy on Anaconda

Hello, I'm having trouble creating a project with Scrapy. I'm studying data science in college and I have to use Scrapy. I'm using Anaconda. First I tried through the Spyder IDE (Anaconda Navigator); now I am trying through the prompt itself. The problem...
asked by 18.09.2018 / 02:58
0
answers

SEO - Single Page Application via parameters

Good morning, devs! I'm getting started with indexing, SEO, and crawling, and I haven't found anything about my case here on the forum or on any other site. My site is an SPA built with HTML anchors. Declaring: <a href="#!home...
asked by 25.05.2018 / 16:53
0
answers

Web crawler proxy settings for Google

Hi, how are you? I have a big problem automating word searches on Google. I would like some help from anyone who has already done this: how should it be set up? I have a machine on DigitalOcean in Toronto (Canada) configured with Squid 3.0, however I...
asked by 10.05.2018 / 16:02
0
answers

How to get the next strong value with jQuery?

Given the following HTML structure, how could I get the value after </strong> and before <br>? <strong>Categoria: </strong>Padaria &amp; Panificação<br>
asked by 12.05.2018 / 21:33