Splitting a specific field of a JSON response with Python


Is it possible to split the value of a specific JSON field, turn the pieces into an array, and then process them dynamically? The reason I ask: I'm developing a file-mining bot, and I ran into a situation where some pages return only one file in that field, while other pages on the same site return several. For my request to be valid and for the PDF to be extracted, I need to split the field whenever that happens.

JSON response:

Multiple files:

Code snippet (URL tested: link):

import json
from scrapy import Request  # assumption: this method belongs to a Scrapy spider

def parseHTML_JS(self, response):
    # Parse the JSON body once and reuse the first result.
    result = json.loads(response.body)['d']['results'][0]
    idBuscaAnexo = result['ID']
    contente = result['Texto']
    data = result['Data1']
    categorias = response.meta['Categoria']
    descricao = response.meta['Description']
    titulo = response.meta['Titulo']
    # Attachment field with its last five characters dropped ([0:][:-5] equals [:-5]).
    pdfs = result['DocumentosAnexados'][:-5]
    url_pdfs = "http://www.bcb.gov.br/pre/normativos/busca/downloadNormativo.asp?arquivo=/Lists/Normativos/Attachments/"+str(idBuscaAnexo)+"/"+str(pdfs)
    req = Request(url=url_pdfs, callback=self.parsePdf)
    # Pass the metadata along to the PDF callback.
    req.meta['Categoria'] = categorias
    req.meta['Description'] = descricao
    req.meta['Titulo'] = titulo
    req.meta['Content'] = contente
    req.meta['Data'] = data
    yield req
    
asked by anonymous 29.05.2018 / 14:01

1 answer


On the result string, use result = result.split('Circ_')[1] to drop the beginning, then result = result.split('_')[0] to get the number. The first split uses "Circ_" as the separator and keeps the second part, which has the form "Number_Number_Character". Store that second part, then split it again using the underscore as the separator and take the first piece: that is the number.
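The two splits described above might look like the following minimal sketch. The helper name extract_number and the sample file names are hypothetical, invented only for illustration; the real strings returned in DocumentosAnexados may look different.

```python
def extract_number(name):
    """Return the text between 'Circ_' and the next underscore."""
    # Drop everything up to and including 'Circ_'.
    after_prefix = name.split('Circ_')[1]
    # Keep only the first underscore-separated piece.
    return after_prefix.split('_')[0]


# Hypothetical attachment names, not taken from the real response.
names = ['Circ_3901_1_v.pdf', 'Circ_3902_2_v.pdf']
numbers = [extract_number(n) for n in names]
print(numbers)
```

If a name does not contain "Circ_", the [1] index raises IndexError, so real responses may need a check before the first split.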

    
30.05.2018 / 19:32