Scraping a list on a site with BeautifulSoup

1

I need to scrape a list from a website into Python, but only the first list. My code looks like this:

import requests
from bs4 import BeautifulSoup

page = requests.get("http://www25.senado.leg.br/web/atividade/materias/-/materia/votacao/2363507")
soup = BeautifulSoup(page.content, 'html.parser')
lista = soup.find_all('ul', class_='unstyled')

This scrapes every list on the page. I only want the voting list under the description "Nominal vote, in the first round, of PEC nº 55/2016, which amends the Act of Transitional Constitutional Provisions, to establish the New Tax Regime, and makes other provisions (Ceiling of Public Expenses)."

But all the lists have the tag ul and the class unstyled. Does anyone know how to tell the lists apart?

I searched a bit more afterwards and read this site: link

I think this is it: lista = soup.find_all('ul', class_='unstyled', limit=2)
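
To check what limit actually does, here is a minimal sketch run against made-up markup instead of the real page. The HTML string and the names first_two and columns are assumptions for illustration; only find_all and its limit argument come from BeautifulSoup itself.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the real page:
# four lists that all share the same tag and class.
html = """
<ul class="unstyled"><li>A</li><li>B</li></ul>
<ul class="unstyled"><li>C</li></ul>
<ul class="unstyled"><li>D</li></ul>
<ul class="unstyled"><li>E</li></ul>
"""

soup = BeautifulSoup(html, "html.parser")

# limit=2 stops the search after the first two matches in document order.
first_two = soup.find_all("ul", class_="unstyled", limit=2)

# Collect the text of every <li>, one inner list per <ul>.
columns = [[li.get_text(strip=True) for li in ul.find_all("li")] for ul in first_two]
print(columns)
```

So limit only helps if the lists I want really are the first ones on the page; it does not look at the description at all.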

    
asked by anonymous 15.08.2017 / 21:45

1 answer

0

Have you tried using Selenium?

pip install selenium
brew install phantomjs

This code does what you need:

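from selenium import webdriver

# Note: this uses the Selenium 3 API. Selenium 4 dropped PhantomJS support and
# removed the find_elements_by_* helpers in favor of find_elements(By.XPATH, ...).
browser = webdriver.PhantomJS()
browser.get("http://www25.senado.leg.br/web/atividade/materias/-/materia/votacao/2363507")

# Every <ul class="unstyled"> on the page, in document order.
list_senadores = browser.find_elements_by_xpath(".//ul[@class='unstyled']")

# The first two matches are printed as the two columns.
print("First column")
for lis in list_senadores[0].find_elements_by_css_selector("li"):
    print(lis.text)

print("Second column")
for lis in list_senadores[1].find_elements_by_css_selector("li"):
    print(lis.text)

# Shut down the headless browser when done.
browser.quit()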
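
If you would rather stay with BeautifulSoup, something along these lines might also work: find the description text first, then take the lists that follow it. This is only a sketch. The HTML below is invented, since I have not checked how the real page nests the description and the lists, and the names description, columns and votes are mine.

```python
import re
from bs4 import BeautifulSoup

# Invented markup for illustration only; the real page may be structured differently.
html = """
<div>
  <p>Nominal vote, in the first round, of PEC 55/2016</p>
  <ul class="unstyled"><li>Senator A - Yes</li><li>Senator B - No</li></ul>
  <ul class="unstyled"><li>Senator C - Yes</li></ul>
  <p>Some other vote</p>
  <ul class="unstyled"><li>Senator D - No</li></ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the text node that holds the description of the vote we care about.
description = soup.find(string=re.compile(r"PEC 55/2016"))

# Take the first two matching lists that come after it in the document.
columns = description.find_all_next("ul", class_="unstyled", limit=2)

# Flatten both columns into a single list of vote lines.
votes = [li.get_text(strip=True) for ul in columns for li in ul.find_all("li")]
print(votes)
```

Anchoring on the description avoids depending on the position of the lists, but it assumes the lists come after the description in the page source.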
    
23.08.2017 / 21:41