Is it possible to split a specific JSON field on line breaks, turn the result into an array, and then process it dynamically? The reason I'm asking: I'm developing a file-mining bot, and I ran into a situation where some pages return only one file in that field, while other pages on the same site can contain several. My request itself works and I can extract the PDF, but I need to do this split whenever the field holds more than one file.
Multiples:
Code snippet (URL tested: link):
import json

from scrapy import Request


def parseHTML_JS(self, response):
    # Parse the JSON body once and take the first result
    result = json.loads(response.body)['d']['results'][0]
    idBuscaAnexo = result['ID']
    contente = result['Texto']
    data = result['Data1']
    categorias = response.meta['Categoria']
    descricao = response.meta['Description']
    titulo = response.meta['Titulo']
    # Drop the last 5 characters of the attachment field
    pdfs = result['DocumentosAnexados'][:-5]
    url_pdfs = ("http://www.bcb.gov.br/pre/normativos/busca/downloadNormativo.asp"
                "?arquivo=/Lists/Normativos/Attachments/" + str(idBuscaAnexo) + "/" + str(pdfs))
    # Pass the collected metadata on to the PDF callback
    req = Request(url=url_pdfs, callback=self.parsePdf)
    req.meta['Categoria'] = categorias
    req.meta['Description'] = descricao
    req.meta['Titulo'] = titulo
    req.meta['Content'] = contente
    req.meta['Data'] = data
    yield req
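
To make the split I have in mind concrete, here is a rough sketch of what I'm after. It assumes that when `DocumentosAnexados` holds several files they are separated by line breaks; I have not confirmed the exact separator. The helper names `split_attachments` and `build_pdf_urls` are made up for illustration and are not part of my spider.

```python
# Base download URL taken from the snippet above
BASE_URL = (
    "http://www.bcb.gov.br/pre/normativos/busca/downloadNormativo.asp"
    "?arquivo=/Lists/Normativos/Attachments/"
)


def split_attachments(raw):
    """Split a raw attachment field into a list of file names.

    Assumption: multiple files are separated by line breaks.
    Blank entries and surrounding whitespace are dropped.
    """
    return [name.strip() for name in (raw or "").splitlines() if name.strip()]


def build_pdf_urls(record_id, raw):
    """Return one download URL per file name found in `raw`."""
    return [f"{BASE_URL}{record_id}/{name}" for name in split_attachments(raw)]
```

With something like this, the callback could loop over `build_pdf_urls(idBuscaAnexo, pdfs)` and yield one `Request` per URL, copying the same `meta` values onto each. A single-file page would simply produce a one-element list, so both cases go through the same path. Whether the existing `[:-5]` trim should be applied before or after the split depends on what the field really contains, which I'm not sure about.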