Extract and print specific values from an xml using Python

Question


1

Hello:)

I'm trying to extract data from an XML document to use in a search with Python, but my code returns only the data from the last tag. The XML at the URL below contains three "Parlamentar" tags, each holding the data of one senator. When I run the code, it returns only the data of the last senator instead of all three. Here is the code I'm using:

import requests
import xmltodict
import dicttoxml
from xml.etree import ElementTree as elements

URL = "http://legis.senado.gov.br/dadosabertos/senador/lista/atual?uf=sp"

dados = requests.get(url=URL)
dadosx = xmltodict.parse(dados.content)
dadosxml = dicttoxml.dicttoxml(dadosx)

root = elements.fromstring(dadosxml)
levels = root.findall('.//IdentificacaoParlamentar')
for level in levels:
    name = level.find('NomeParlamentar').text
    code = level.find('CodigoParlamentar').text
print name, code

This code returns only:

Marta Suplicy 5000

Could someone tell me where I'm wrong?

Thanks for the attention :)

python xml

asked by anonymous 25.04.2018 / 03:19

2 answers

1

The print call is missing its indentation, so it runs only once, after the loop has finished. Indent it so that it runs on every iteration and each value is printed:

for level in levels:
    name = level.find('NomeParlamentar').text
    code = level.find('CodigoParlamentar').text
    print(name, code)

OUTPUT:

Airton Sandoval 5140
José Serra 90
Marta Suplicy 5000
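
To see the effect in isolation, here is a minimal sketch that needs no network access. The list `senators` and the two helper functions are hypothetical stand-ins for the parsed XML elements, not part of the original code; the only difference between the helpers is the indentation of one line.

```python
# Hypothetical sample data standing in for the elements found in the XML.
senators = [
    ("Airton Sandoval", "5140"),
    ("José Serra", "90"),
    ("Marta Suplicy", "5000"),
]


def collect_after_loop(rows):
    """Mimics the question: the output step sits outside the loop body.

    Assumes rows is non-empty, otherwise `line` is never assigned.
    """
    printed = []
    for name, code in rows:
        line = f"{name} {code}"
    # Dedented: runs once, after the loop, so only the last value survives.
    printed.append(line)
    return printed


def collect_inside_loop(rows):
    """Mimics the fix: the output step is indented into the loop body."""
    printed = []
    for name, code in rows:
        line = f"{name} {code}"
        # Indented: runs on every iteration.
        printed.append(line)
    return printed


print(collect_after_loop(senators))
print(collect_inside_loop(senators))
```

The first call yields a single entry (the last senator), the second yields all three.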

25.04.2018 / 03:29


score 1 · Accepted Answer

Python delimits code blocks by indentation. Your code returns only one name because the print statement sits outside the for loop: the loop runs through all three items, but print executes once after it ends and writes only the last values assigned to name and code. Your code should be:

import requests
import xmltodict
import dicttoxml
from xml.etree import ElementTree as elements

URL = "http://legis.senado.gov.br/dadosabertos/senador/lista/atual?uf=sp"

dados = requests.get(url=URL)
dadosx = xmltodict.parse(dados.content)
dadosxml = dicttoxml.dicttoxml(dadosx)

root = elements.fromstring(dadosxml)
levels = root.findall('.//IdentificacaoParlamentar')
for level in levels:
  name = level.find('NomeParlamentar').text
  code = level.find('CodigoParlamentar').text
  print(name)
  print(code)

Output:

Airton Sandoval
5140
José Serra
90
Marta Suplicy
5000

This site has a short explanation of how indentation works in Python.
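
As a side note, the round trip through xmltodict and dicttoxml may not be needed, since ElementTree can parse the XML text directly. The following is only a sketch under assumptions: `SAMPLE_XML` is a hand-written stand-in built from the tag names in the question (the wrapper tags `ListaParlamentarEmExercicio` and `Parlamentares` are guesses about the real response), and `extract_senators` is a hypothetical helper name.

```python
from xml.etree import ElementTree as ET

# Hand-written sample; the real response layout is an assumption here.
SAMPLE_XML = """
<ListaParlamentarEmExercicio>
  <Parlamentares>
    <Parlamentar>
      <IdentificacaoParlamentar>
        <CodigoParlamentar>5140</CodigoParlamentar>
        <NomeParlamentar>Airton Sandoval</NomeParlamentar>
      </IdentificacaoParlamentar>
    </Parlamentar>
    <Parlamentar>
      <IdentificacaoParlamentar>
        <CodigoParlamentar>90</CodigoParlamentar>
        <NomeParlamentar>José Serra</NomeParlamentar>
      </IdentificacaoParlamentar>
    </Parlamentar>
    <Parlamentar>
      <IdentificacaoParlamentar>
        <CodigoParlamentar>5000</CodigoParlamentar>
        <NomeParlamentar>Marta Suplicy</NomeParlamentar>
      </IdentificacaoParlamentar>
    </Parlamentar>
  </Parlamentares>
</ListaParlamentarEmExercicio>
"""


def extract_senators(xml_text):
    """Return a list of (name, code) tuples, one per IdentificacaoParlamentar."""
    root = ET.fromstring(xml_text.strip())
    result = []
    # iter() walks the whole tree, so the nesting depth does not matter.
    for ident in root.iter("IdentificacaoParlamentar"):
        name = ident.findtext("NomeParlamentar")
        code = ident.findtext("CodigoParlamentar")
        result.append((name, code))
    return result


for name, code in extract_senators(SAMPLE_XML):
    print(name, code)
```

With the live endpoint, the same function could presumably be fed the text of the requests response instead of the sample string, provided the real tags match the ones assumed above.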