Read blocks of XML tags based on a search


I have a folder that stores the log files of inserts made into the database. The log files follow this structure:

             <item xsi:type="tns:StatusResultReport">
                <id xsi:type="xsd:int">1569692</id>
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Adicionado</msg>
                <sucesso xsi:type="xsd:boolean">true</sucesso>
            </item>
            <item xsi:type="tns:StatusResultReport">
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Não encontrado</msg>
                <sucesso xsi:type="xsd:boolean">false</sucesso>
            </item>
            <item xsi:type="tns:StatusResultReport">
                <id xsi:type="xsd:int">1569693</id>
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Adicionado</msg>
                <sucesso xsi:type="xsd:boolean">true</sucesso>
            </item>
            <item xsi:type="tns:StatusResultReport">
                <id xsi:type="xsd:int">1569694</id>
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Adicionado</msg>
                <sucesso xsi:type="xsd:boolean">true</sucesso>
            </item>
            <item xsi:type="tns:StatusResultReport">
                <id xsi:type="xsd:int">1569695</id>
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Adicionado</msg>
                <sucesso xsi:type="xsd:boolean">true</sucesso>
            </item>

I know that a record was inserted when the sucesso tag has the value true, and that the insert failed when the value is false.

What I want is a Python program that reads these files and extracts the <item></item> blocks that contain sucesso = false.

I tried the code below, but it only prints the first line that contains false, not the whole block:

search = 'false'

def check():
    datafile = open('C:\TESTE\LCL_20170420_30052.67.XML')
    for line in datafile:
        if search in line:
            found = True
            print(line)
            break
        else:
            found = False
    return found


check()
    
asked by anonymous 20.04.2017 / 14:32

1 answer


You can use the BeautifulSoup library to navigate the tags in your XML file.

You could also use other libraries, such as xml.etree.ElementTree, minidom or lxml.
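As a rough illustration of the standard-library alternative, here is a minimal sketch using xml.etree.ElementTree. It rests on assumptions that are not confirmed by the question: that the real log files declare the xsi prefix on an enclosing element (ElementTree rejects a fragment with an undeclared prefix), and that the item elements are not in a default namespace. The wrapper element root and the helper failed_items are hypothetical names used only for this example.

```python
import xml.etree.ElementTree as ET

# Assumption: in the real files the xsi prefix is declared on an outer
# element. Here a hypothetical <root> wrapper declares it for the fragment.
XSI = 'http://www.w3.org/2001/XMLSchema-instance'

fragment = '''
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569692</id>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <msg xsi:type="xsd:string">Não encontrado</msg>
    <sucesso xsi:type="xsd:boolean">false</sucesso>
</item>
'''


def failed_items(xml_text):
    """Return the <item> elements whose <sucesso> child is 'false'."""
    root = ET.fromstring(xml_text)
    return [
        item for item in root.iter('item')
        # findtext returns None when the child is missing, hence the "or ''"
        if (item.findtext('sucesso') or '').strip() == 'false'
    ]


wrapped = '<root xmlns:xsi="{}">{}</root>'.format(XSI, fragment)
failed = failed_items(wrapped)

for item in failed:
    # Serialize the whole block back to text
    print(ET.tostring(item, encoding='unicode'))
    print('=' * 5)
```

If the files do use a default namespace, the tag names passed to iter and findtext would need the namespace URI in braces in front of them.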

Code:

s = '''
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569692</id>
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Não encontrado</msg>
    <sucesso xsi:type="xsd:boolean">false</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569693</id>
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569694</id>
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569695</id>
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(s, 'lxml')

item_tags = soup.find_all('item')

for item in item_tags:
    # Keep only the items whose insert failed (sucesso is false)
    if item.sucesso.text == 'false':
        print(item)
        print('='*5)

Output:

<item xsi:type="tns:StatusResultReport">
<placa xsi:type="xsd:string">XXX</placa>
<ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
<msg xsi:type="xsd:string">Não encontrado</msg>
<sucesso xsi:type="xsd:boolean">false</sucesso>
</item>
=====
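Since the question is about a folder of log files rather than a string, the following sketch shows one possible way to walk a folder and collect the failed blocks. It deliberately swaps the parser for plain regular expressions from the standard library, which is only reasonable here because the item blocks are flat and never nested. The function name failed_blocks, the *.XML file pattern and the UTF-8 encoding are assumptions for illustration, and the demo writes a small sample file to a temporary folder instead of using the real logs.

```python
import re
import tempfile
from pathlib import Path

# One <item ...>...</item> block; non-greedy, so it stops at the first
# closing tag. This relies on items never being nested.
ITEM_RE = re.compile(r'<item\b.*?</item>', re.DOTALL)
# A <sucesso> element whose text is "false"
FAILED_RE = re.compile(r'<sucesso\b[^>]*>\s*false\s*</sucesso>')


def failed_blocks(folder, pattern='*.XML'):
    """Yield (file name, item block) for every failed item in the folder."""
    for path in sorted(Path(folder).glob(pattern)):
        # Assumption: the log files are UTF-8 encoded
        text = path.read_text(encoding='utf-8')
        for block in ITEM_RE.findall(text):
            if FAILED_RE.search(block):
                yield path.name, block


sample = '''<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569692</id>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <msg xsi:type="xsd:string">Não encontrado</msg>
    <sucesso xsi:type="xsd:boolean">false</sucesso>
</item>
'''

# Demo on a temporary folder holding a single hypothetical log file
with tempfile.TemporaryDirectory() as folder:
    Path(folder, 'exemplo.XML').write_text(sample, encoding='utf-8')
    results = list(failed_blocks(folder))

for name, block in results:
    print(name)
    print(block)
    print('=' * 5)
```

For the real logs, failed_blocks would be called with the path of the log folder. A raw string such as r'C:\TESTE' keeps the backslashes in a Windows path from being read as escape sequences.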


20.04.2017 / 16:18