Questions tagged as 'web-scraping'

1
answer

How can I keep only specific DataFrame rows?

I have code that opens a website, fills in a form, and pulls a table. However, I want to delete some rows from this table that I do not need. Here is the code: #library's require(RCurl) require(XML) require(stringr) require(plyr) req...
asked on 07.03.2017 / 21:05
1
answer

How to extract web content (web scraping) with C#?

Recently I learned how to do web scraping and got it working on some sites, but not on others. I noticed that in some of those I cannot get a "#"; what does that mean? Here is an example of a site where this happens to me: link Is t...
asked on 21.06.2018 / 20:55
0
answers

Does anyone know how to do web scraping on the SICONV (Free Access) website?

I'm trying to use R to extract the information on covenants from the SICONV site: link link When I use R with the rvest and httr packages, I am redirected to the login screen located at link. I tried to...
asked on 02.12.2018 / 01:28
1
answer

How to apply opacity to a DOM element - createImage(); - through a JavaScript editor?

I'm using p5.js - a JavaScript library - to capture images from a news API. I would like these images to be superimposed, but with reduced opacity, so that the images blend. I'm not able to apply opacity directly in the JavaScript code. I can change the posi...
asked on 22.06.2018 / 13:36
1
answer

Downloading a file after filling in a form

I'm trying to access a website, fill out its form, and download the file, but I'm running into some difficulties. This is my code so far: #library's require(rvest) #website url <- ("http://www.anbima.com.br/est_termo/Curva_Zero.asp") pg...
asked on 02.03.2017 / 19:18
1
answer

How to ignore links that do not meet the established conditions and continue scraping?

I'd like to know how to ignore links that do not meet the conditions set for title, date_time, and text, so that the scraper can continue through the site. This is the error that occurs when a link lacks or does not meet the conditions: "Erro...
asked on 01.09.2016 / 01:57
2
answers

Web scraping the pension scoreboard (placar previdencia)

I need to extract the information from this site into an Excel file: which MPs vote for, against, abstain, and so on. It's a web scraping task, but since I do not understand HTML well, I'm having trouble understanding the nodes. I've tried read_html , readH...
asked on 07.04.2017 / 22:24
1
answer

Creating a program to get important news from a site

from bs4 import BeautifulSoup import requests url = 'http://g1.com.br/' header = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ' 'AppleWebKit/537.36 (KHTML, like Gecko) ' 'Chrome/51.0.2...
asked on 02.08.2016 / 03:28
1
answer

Use lambda expressions to refine the parameters of a for loop in C#

Good afternoon! I have a question. I am developing data-collection code, and at a certain point it needs a for loop to iterate over the values in the list and then save the information in an item ... So far so good; however, the problem arises bec...
asked on 13.11.2018 / 14:43
1
answer

RSelenium error - Selenium message: Java heap space

Hello, I'm trying to scrape the link using RSelenium because the page only generates the information in the HTML when it is loaded in a browser. When that is the case, I do not know any way to do it other than with RSelenium. But when the lo...
asked on 27.09.2017 / 19:23