How to scrape table data with Python using BeautifulSoup

0

I'm a beginner and I'm trying to get a table from the transparency portal site, but the tag I fetch comes back with no data in it. When I open the developer tools I can see the data I want (the states and the transfer values), but when I press Ctrl+U to view the page source, the data does not appear inside the tag. It may sound confusing, but the images below illustrate it.

When I look at the tag in Python it appears with nothing inside, just like when I view the page source with Ctrl+U. What am I doing wrong?

    import requests
    from bs4 import BeautifulSoup

    # Download the page's static HTML (no JavaScript is executed here)
    page = requests.get("http://www.portaltransparencia.gov.br/funcoes/12-educacao?ano=2018")
    soup = BeautifulSoup(page.content, 'html.parser')
    # Locate the table and list its <tbody> elements
    p = soup.find('table', class_='tabelaPrimeiroNivel')
    forecast_items = p.find_all('tbody')
    print(forecast_items)
    
asked by anonymous 10.11.2018 / 10:18

1 answer

1

Your problem is that the data is not in the page. When you access the page, an empty skeleton is loaded where the data should be. The page then runs JavaScript code that makes a separate request to the server and creates those elements dynamically, after the page has loaded.

Since BeautifulSoup does not execute JavaScript, you only have access to the still-empty page, so you cannot get this data with it.

You can verify this by opening the developer tools and loading the page with the "Network" tab selected: you'll see that the page makes several additional requests, which is where the dynamic data comes from.
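
The same effect can be reproduced offline. The sketch below is only an illustration: STATIC_HTML is a made-up stand-in for what the server sends before any JavaScript runs (only the class name tabelaPrimeiroNivel comes from the question), and it uses the standard library's html.parser instead of BeautifulSoup so that it runs without installing anything.

```python
from html.parser import HTMLParser

# Made-up stand-in for the HTML the server sends before any JavaScript runs.
STATIC_HTML = """
<table class="tabelaPrimeiroNivel">
  <thead><tr><th>UF</th><th>Valor</th></tr></thead>
  <tbody></tbody>
</table>
"""


class BodyRowCounter(HTMLParser):
    """Counts <tr> elements that appear inside a <tbody>."""

    def __init__(self):
        super().__init__()
        self.in_tbody = False
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self.in_tbody = True
        elif tag == "tr" and self.in_tbody:
            self.rows += 1

    def handle_endtag(self, tag):
        if tag == "tbody":
            self.in_tbody = False


def count_body_rows(html):
    """Return how many data rows the given HTML text contains."""
    parser = BodyRowCounter()
    parser.feed(html)
    parser.close()
    return parser.rows


print(count_body_rows(STATIC_HTML))  # 0
```

A parser that only reads the downloaded text finds zero data rows, which matches the empty list printed by the code in the question.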

There are two general solutions, plus a shortcut that applies to this site:

  • Use Selenium, a Python library that lets you control a real browser, such as Firefox or Chrome. Since real browsers run JavaScript, you'll be able to get the data that way, but this solution is less efficient because you need to load a heavy browser and many page elements you don't care about.

  • Read the page, examine its code and the requests it makes via JavaScript, and then manually write Python code that mimics those requests. This method usually takes more work, but the result is more efficient, since you end up with code that does only what is needed to fetch the data you want.

  • Fortunately, the transparency portal has an API, an interface that lets programmers retrieve the data without having to parse the pages. Its usage is explained at this link.

    An example:

    import requests

    # Request the JSON data directly instead of the HTML page
    r = requests.get('http://www.portaltransparencia.gov.br/funcoes/12/mapa',
        params={'ano': '2018'})
    print(r.json())
    

    Result:

    [{'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'ACRE',
      'siglaUF': 'AC',
      'valor': 271382820.59},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'ALAGOAS',
      'siglaUF': 'AL',
      'valor': 762900876.36},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'AMAPÁ',
      'siglaUF': 'AP',
      'valor': 202949699.19},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'AMAZONAS',
      'siglaUF': 'AM',
      'valor': 704229532.02},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'BAHIA',
      'siglaUF': 'BA',
      'valor': 1800232448.53},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'CEARÁ',
      'siglaUF': 'CE',
      'valor': 1317203323.08},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'DISTRITO FEDERAL',
      'siglaUF': 'DF',
      'valor': 1702869722.04},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'ESPÍRITO SANTO',
      'siglaUF': 'ES',
      'valor': 1005278642.49},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'GOIÁS',
      'siglaUF': 'GO',
      'valor': 1300024908.65},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'MARANHÃO',
      'siglaUF': 'MA',
      'valor': 904528606.79},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'MATO GROSSO',
      'siglaUF': 'MT',
      'valor': 848977509.3},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'MATO GROSSO DO SUL',
      'siglaUF': 'MS',
      'valor': 812220959.61},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'MINAS GERAIS',
      'siglaUF': 'MG',
      'valor': 5612411096.05},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'PARANÁ',
      'siglaUF': 'PR',
      'valor': 1913617246.26},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'PARAÍBA',
      'siglaUF': 'PB',
      'valor': 1626800821.69},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'PARÁ',
      'siglaUF': 'PA',
      'valor': 1502290653.09},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'PERNAMBUCO',
      'siglaUF': 'PE',
      'valor': 1793890169.14},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'PIAUÍ',
      'siglaUF': 'PI',
      'valor': 752510959.88},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'RIO DE JANEIRO',
      'siglaUF': 'RJ',
      'valor': 5077770452.72},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'RIO GRANDE DO NORTE',
      'siglaUF': 'RN',
      'valor': 1417979764.75},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'RIO GRANDE DO SUL',
      'siglaUF': 'RS',
      'valor': 4444340585.5},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'RONDÔNIA',
      'siglaUF': 'RO',
      'valor': 334773348.77},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'RORAIMA',
      'siglaUF': 'RR',
      'valor': 226714164.22},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'SANTA CATARINA',
      'siglaUF': 'SC',
      'valor': 1531789135.42},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'SERGIPE',
      'siglaUF': 'SE',
      'valor': 622536740.58},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'SÃO PAULO',
      'siglaUF': 'SP',
      'valor': 1981995537.81},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': 'TOCANTINS',
      'siglaUF': 'TO',
      'valor': 424378306.98},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': '',
      'siglaUF': 'Nacional',
      'valor': 36308921677.23},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': '',
      'siglaUF': 'Centro-Oeste',
      'valor': 0.0},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': '',
      'siglaUF': 'Sul',
      'valor': 175660334.04},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': '',
      'siglaUF': 'Nordeste',
      'valor': 248671827.99},
     {'codigoIBGE': '',
      'nomeMunicipio': '',
      'nomeUF': '',
      'siglaUF': 'Sudeste',
      'valor': 0.0}]
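
Once the data arrives as a list of dictionaries, rebuilding the table is ordinary Python. The sketch below is one possible way to do it, not the only one: it assumes that entries with an empty nomeUF (such as "Nacional" and the regions) are aggregates rather than states, and states_by_value is a hypothetical helper name. Only four records are copied here to keep it short; in practice the list would be r.json().

```python
# A few records copied from the result above; in practice use r.json().
records = [
    {'codigoIBGE': '', 'nomeMunicipio': '', 'nomeUF': 'ACRE',
     'siglaUF': 'AC', 'valor': 271382820.59},
    {'codigoIBGE': '', 'nomeMunicipio': '', 'nomeUF': 'MINAS GERAIS',
     'siglaUF': 'MG', 'valor': 5612411096.05},
    {'codigoIBGE': '', 'nomeMunicipio': '', 'nomeUF': 'BAHIA',
     'siglaUF': 'BA', 'valor': 1800232448.53},
    {'codigoIBGE': '', 'nomeMunicipio': '', 'nomeUF': '',
     'siglaUF': 'Nacional', 'valor': 36308921677.23},
]


def states_by_value(rows):
    """Keep only entries that have a state name, largest value first."""
    # Assumption: an empty 'nomeUF' marks an aggregate, not a state.
    states = [row for row in rows if row['nomeUF']]
    return sorted(states, key=lambda row: row['valor'], reverse=True)


# Print one aligned line per state: abbreviation, name, value.
for row in states_by_value(records):
    print(f"{row['siglaUF']}  {row['nomeUF']:<20} {row['valor']:>18,.2f}")
```

With the sample above, Minas Gerais comes first, then Bahia, then Acre, and the "Nacional" total is left out.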
    
        
    10.11.2018 / 16:58