Questions tagged as 'web-scraping'

1
answer

How to recognize and change the encoding of Latin characters in R?

Is there an efficient way to detect the encoding of text downloaded from the internet? I scraped a site (see code below) and cannot find the correct encoding. In the page source, the META tag specifies "iso-8859-1" (...
asked on 01.05.2016 / 19:34
1
answer

How to do web scraping of an HTTPS site using rvest?

I'd like to scrape a page served over HTTPS using the rvest package. However, the site has security certificate issues. In these cases, you need to turn off SSL verification, but I do not know how to do this in that package. No...
asked on 16.12.2015 / 17:00
3
answers

Web scraping with R

I'm trying to scrape the following link: link. I want to access all categories and extract a data frame with the names of all companies. If you click on the name of any of the companies, you will see some data like: Fancy na...
asked on 21.01.2016 / 13:28
1
answer

Extract information from Lattes

Introduction: Since 1999, Brazilian researchers have had a website where they can post information about their academic careers. This information is known as Currículos Lattes. I want to download a few thousand of these resumes and write, along...
asked on 18.04.2018 / 17:32
1
answer

Generate links and download content programmatically

I would like to know how to collect data from a website. The site is link. There I have to download all the data in the operation history, from power generation to Natural Energy Influence. The problem is that within each data series, you...
asked on 14.12.2015 / 14:56
1
answer

How to scrape a site that uses the POST method?

I'm having trouble scraping sites that use the POST method. For example, I need to extract all news related to political parties from the site: link. Below is a script I wrote for a newspaper site that uses the get...
asked on 31.05.2016 / 18:18
1
answer

Web Scraping: How to change the value of a drop-down button of a site using R?

I want to create a script in R to read an HTML table. Doing this from a static page with the rvest package is easy. The problem is that I have to change the value of two buttons on the page. The site is this one. Note that above the...
asked on 21.06.2016 / 05:34
2
answers

Render a specific part of a page

I'm using the following code to render a webpage: import dryscrape # set up a web scraping session sess = dryscrape.Session(base_url = 'http://www.google.com.br') # we don't need images sess.set_attribute('auto_load_images', True) # visit s...
asked on 26.05.2015 / 16:08
1
answer

Web scraping with Selenium + Python on a site with dynamic generation via JS: difficulty mapping elements

Good afternoon. I'm developing a script that: accesses a system; finds certain information within that environment; generates a kind of report; creates a spreadsheet with the data. My problem comes before the parsing stage. I can access the...
asked on 28.06.2017 / 21:23
1
answer

Error with requests with scrapy

I have a CSV file with some URLs that need to be accessed. http://www.icarros.com.br/Audi, Audi http://www.icarros.com.br/Fiat, Fiat http://www.icarros.com.br/Chevrolet, Chevrolet I have a spider to make all the requests. import scrapy i...
asked on 09.09.2016 / 15:19