Going back a position in JSON-format data or a list with Python

1

I'm working with data similar to the structure below:

{"Id":1,
"Data_inscricao":"2017-01-01",
"Texto":"Loremipsum",
"Numeracao":26,
"Tempo":"25s"}, 
{"Id":3,
"Data_inscricao":"2010-05-02",
"Texto":"LoremipsumLorem",
"Numeracao":656,
"Tempo":9},....

I have the value 656, which corresponds to "Numeracao". I need to go back 2 positions from my .get("Numeracao") to reach the value "2010-05-02", that is, to use .get("Data_inscricao") but for the record where "Numeracao" is 656.

How do I do this with JSON or a list variable? My current code is below:

numeracao = '656'

#The URL is private, so I can't show its content
html = urlopen("https://www.teste.com.br")

#It returns a very large volume of data, not just 2 blocks of records.
bsObj = BeautifulSoup(html)

informacoes = bsObj.findAll(id="Resultados")
print(informacoes)

    #print() output - BEGIN

    [<input id=&quot;Resultados&quot; type=&quot;hidden&quot; value=&quot;{

    &quot;result&quot;:true,&quot;message&quot;:&quot;ok&quot;,&quot;Contador&quot;:2282,&quot;Dados&quot;:
    [
    {&quot;Id&quot;:1,
    &quot;Data_inscricao&quot;:&quot;2017-01-01&quot;,
    &quot;Texto&quot;:&quot;Loremipsum&quot;,
    &quot;Numeracao&quot;:26,
    &quot;Tempo&quot;:&quot;25s&quot;}, 
    {&quot;Id&quot;:3,
    &quot;Data_inscricao&quot;:&quot;2010-05-02&quot;,
    &quot;Texto&quot;:&quot;LoremipsumLorem&quot;,
    &quot;Numeracao&quot;:656,
    &quot;Tempo&quot;:9}
    ]

    }&quot;/>]
    #print() output - END

informacoes = informacoes.replace('&quot;', '\"')
print(type(informacoes))

    #print() output - BEGIN
    <class 'str'>
    #print() output - END

print(informacoes)

    #print() output - BEGIN

    [<input id="Resultados" type="hidden" value="{

    "result":true,"message":"ok","Contador":2282,"Dados":
    [
    {"Id":1,
    "Data_inscricao":"2017-01-01",
    "Texto":"Loremipsum",
    "Numeracao":26,
    "Tempo":"25s"}, 
    {"Id":3,
    "Data_inscricao":"2010-05-02",
    "Texto":"LoremipsumLorem",
    "Numeracao":656,
    "Tempo":9}
    ]

    }"/>]

    #print() output - END

regex = re.compile('(?:\"Dados\":\[)(.*?)(?:[]}"/>]])')

informacoes = re.findall(regex, informacoes)
print(type(informacoes))

    #print() output - BEGIN
    <class 'list'>
    #print() output - END

#Print the content, treating it as a list
for dados in informacoes:
    print(type(dados))
        #print() output - BEGIN
        <class 'str'>
        #print() output - END

    print(dados)
        #print() output - BEGIN

        {"Id":1,
        "Data_inscricao":"2017-01-01",
        "Texto":"Loremipsum",
        "Numeracao":26,
        "Tempo":"25s"}, 
        {"Id":3,
        "Data_inscricao":"2010-05-02",
        "Texto":"LoremipsumLorem",
        "Numeracao":656,
        "Tempo":9

        #print() output - END
    #In the print above the closing } really is missing, probably because of the regex
    
asked by anonymous 07.06.2017 / 03:18

4 answers

1

I think good answers have already been given here, but I will leave my interpretation, also to cover the analysis requested in the comments. The first thing I notice is that the example presented as JSON is not actually valid JSON. In terms of format, JSON is very similar to Python dictionaries (or lists of dictionaries), so I adapted the example to the most likely format:

[  
   {  
      "Id":1,
      "Data_inscricao":"2017-01-01",
      "Texto":"Loremipsum",
      "Numeracao":26,
      "Tempo":"25s"
   },
   {  
      "Id":3,
      "Data_inscricao":"2010-05-02",
      "Texto":"LoremipsumLorem",
      "Numeracao":656,
      "Tempo":9
   }
]

If you take the content above, save it in a text file named "j1.txt", open a Python console and run the commands below:

import json

# Read the file and parse its JSON content into Python objects
with open('/path/j1.txt') as f:
    data = json.load(f)
print(data)

What you get is a regular Python list (of dictionaries), and the print output will look very similar to the contents of the text file. With the data in this correct format and loaded into Python, we can do what you want:

number = 656
# Look at every value of every record, regardless of its key
for d in data:
    for value in d.values():
        if value == number:
            print('The registration date is: ', d['Data_inscricao'])

 The registration date is:  2010-05-02

As I said, JSON or a dictionary is not 'navigable' by position, only through its keys and values. So what I did was walk the list of objects (a list does allow positional navigation), find which object contains the value you are looking for (note that I did not check whether the key is Numeracao, although I could, and I do so next), and finally read the value of Data_inscricao.

If you want to restrict the search to the Numeracao key only, you could do this:

# Restricting the search to the 'Numeracao' key
for d in data:
    for key in d.keys():
        if key == 'Numeracao':
            if d[key] == number:
                print('The registration date is: ', d['Data_inscricao'])

The registration date is:  2010-05-02

Note that in both cases, if the value 656 appears n times (restricted to the Numeracao key in one version, regardless of the key in the other), the registration date will be displayed n times.

Final Consideration
You could create a strategy to 'go back' one, two or n positions in the object inside the JSON (note that you would not be going back in the JSON itself, but in the object within it, which is a dictionary), but this would be true masochism. One way would be: identify the dictionary that contains the searched value, put its keys in an array mkeys and its values in another array mvalues, identify the position (index) of the searched value in mvalues, go back the desired n positions in mkeys (only to get the name of the key) and access the value in mvalues at that same position. Except as a mere exercise, it does not seem sensible.
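Purely as an exercise, the strategy described above could be sketched as follows. The function name value_n_keys_back is hypothetical, and the sketch assumes Python 3.7+, where dictionaries keep insertion order (which json.loads preserves), so that "n positions back" is well defined.

```python
def value_n_keys_back(record, value, n):
    """Return the (key, value) pair found n positions before `value` in `record`."""
    mkeys = list(record.keys())      # keys in insertion order
    mvalues = list(record.values())  # values in the same order
    index = mvalues.index(value)     # raises ValueError if the value is absent
    target = index - n
    if target < 0:
        raise IndexError("cannot go back that many positions")
    return mkeys[target], mvalues[target]


record = {
    "Id": 3,
    "Data_inscricao": "2010-05-02",
    "Texto": "LoremipsumLorem",
    "Numeracao": 656,
    "Tempo": 9,
}

key, found = value_n_keys_back(record, 656, 2)
print(key, found)  # Data_inscricao 2010-05-02
```

This depends on the order of the fields in the source JSON, which is exactly why looking the value up by its key name is the more robust choice.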

See working code on repl.it.

    
20.06.2017 / 04:45
2

I believe your difficulty is in how to turn a string into a dictionary. Since you converted the JSON to a string in order to use replace, you need to make Python give it back to you as a dict before you can query it. To do this you can use the built-in json module.

import json

informacoes = '{"resultado":true, "mensagem":"ok", "Contador":2144,\
                "Dados":[{"Id":1, "Data_inscricao":"2017-01-01",\
                          "Texto":"Loremipsum", "Numeracao":26, "Tempo":"25s"},\
                         {"Id":3, "Data_inscricao":"2010-05-02",\
                          "Texto":"LoremipsumLorem", "Numeracao":656, "Tempo":"96s"}]}'

# Tell Python that your string should be read as JSON
data = json.loads(informacoes)

# If you run print(type(data)) you will see that the str is now treated as a dict

Then just search the dictionaries for the information you want. There are several ways to do this; below is just one of them.

numeracao = 656

for dictio in data["Dados"]:
    if dictio["Numeracao"] == numeracao:
        print(dictio["Data_inscricao"])

NOTE: Do not search for the Numeracao value as a str, because in the JSON it is an int.
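A small sketch of why the type matters, using a trimmed-down version of the data (only two fields are kept here for brevity). If the number arrives as text, for example from user input, converting it with int() before comparing is one option.

```python
import json

data = json.loads('{"Dados": [{"Data_inscricao": "2010-05-02", "Numeracao": 656}]}')
numeracao = "656"  # a str, as in the question's code

first = data["Dados"][0]
print(first["Numeracao"] == numeracao)       # False: int compared with str
print(first["Numeracao"] == int(numeracao))  # True: both sides are int
```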

    
09.06.2017 / 20:50
1

Hi, I'm a Java programmer, but I think this line of reasoning works for Python too:

1 - You need to transform this JSON into a Python data structure, i.e. a list of objects (dictionaries);

2 - With the list of objects in hand, you need to filter the items that have the value 656 in the "Numeracao" attribute. There may already be a library that abstracts much of this for you; otherwise you will have to go through all the items, checking the "Numeracao" property of each one, to return the one object you are looking for;

3 - With the item (object) found, you access its "Data_inscricao" property.
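In Python, the three steps above might look roughly like this. The helper name find_by_numeracao is hypothetical, and the sample text assumes the JSON is a plain list of records with only some of the fields.

```python
import json


def find_by_numeracao(json_text, numeracao):
    """Steps 1 and 2: parse the JSON text and keep only the matching records."""
    records = json.loads(json_text)  # step 1: JSON text -> list of dicts
    return [r for r in records if r.get("Numeracao") == numeracao]  # step 2


json_text = (
    '[{"Id": 1, "Data_inscricao": "2017-01-01", "Numeracao": 26},'
    ' {"Id": 3, "Data_inscricao": "2010-05-02", "Numeracao": 656}]'
)

matches = find_by_numeracao(json_text, 656)
dates = [r["Data_inscricao"] for r in matches]  # step 3: read the property
print(dates)  # ['2010-05-02']
```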

    
07.06.2017 / 03:27
1

It seems that your main problem is not converting the data to JSON, but extracting the data from the HTML with BeautifulSoup.

I'm not that familiar with BeautifulSoup, but from what I saw in the documentation you need to use the soup.find() method, because soup.findAll() returns a list of elements, whereas soup.find() returns a single element or None.

Once you have found the element, just read the attribute you want directly in Python with __getitem__ (for example elemento['atributo']).

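import json
from urllib.request import urlopen

from bs4 import BeautifulSoup

html = urlopen("https://www.teste.com.br")
soup = BeautifulSoup(html, "html.parser")

# Get the element you want
info = soup.find(id="Resultados")

if info is None:
    # No element found
    raise ValueError("no element with id 'Resultados'")

# Get only the attribute you want
json_str = info.get('value')

if json_str is None:
    # The element has no 'value' attribute
    raise ValueError("the element has no 'value' attribute")

# Convert the JSON string into Python objects
json_data = json.loads(json_str)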
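To try the extraction step without network access or third-party packages, here is a rough sketch that swaps BeautifulSoup for the standard library's html.parser. The ValueGrabber class and the shortened sample_html string are made up for illustration. Note that the parser already decodes the &quot; entities in attribute values, so no manual replace is needed.

```python
import json
from html.parser import HTMLParser


class ValueGrabber(HTMLParser):
    """Collects the 'value' attribute of the first tag with a given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.value = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.value is None and attributes.get("id") == self.element_id:
            self.value = attributes.get("value")


# Shortened stand-in for the private page's hidden input
sample_html = (
    '<input id="Resultados" type="hidden" value="{&quot;result&quot;:true,'
    '&quot;Dados&quot;:[{&quot;Numeracao&quot;:656,'
    '&quot;Data_inscricao&quot;:&quot;2010-05-02&quot;}]}"/>'
)

parser = ValueGrabber("Resultados")
parser.feed(sample_html)

json_data = json.loads(parser.value)
print(json_data["Dados"][0]["Data_inscricao"])  # 2010-05-02
```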

Now that your data is loaded from JSON, it's time to choose how the required data will be extracted.

If you only search the data by the Numeracao field and need to do more than one search, you can build a dict keyed by the Numeracao field, so that each lookup in the dict is fast (O(1) on average).

Ex:

import json

# json_data assigned in the previous code

dados = { data['Numeracao']: data for data in json_data['Dados'] }

# print(dados)
"""
dados = {
    656: {
        'Tempo': '96s', 
        'Data_inscricao': '2010-05-02', 
        'Id': 3, 
        'Numeracao': 656, 
        'Texto': 'LoremipsumLorem'
    }, 
    26: {
        'Tempo': '25s', 
        'Data_inscricao': '2017-01-01', 
        'Id': 1, 
        'Numeracao': 26, 
        'Texto': 'Loremipsum'
    }
}
"""

# Now searching by Numeracao is simple and fast
numeracao = 656

# Returns the item with key 656, or None if the item does not exist
item = dados.get(numeracao)

if item:
    print('Data found ---> Data_inscricao:', item['Data_inscricao'])
else:
    print('Data not found')
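One caveat with a dict keyed by Numeracao: if two records share the same Numeracao, the later one silently replaces the earlier one. If that can happen in your data, a possible variation is to group the records in lists. The sketch below uses made-up sample data, including a hypothetical duplicate record with Id 7.

```python
from collections import defaultdict

# Sample data; the record with Id 7 is a hypothetical duplicate of Numeracao 656
json_data = {
    "Dados": [
        {"Id": 1, "Data_inscricao": "2017-01-01", "Numeracao": 26},
        {"Id": 3, "Data_inscricao": "2010-05-02", "Numeracao": 656},
        {"Id": 7, "Data_inscricao": "2012-11-30", "Numeracao": 656},
    ]
}

# Map each Numeracao to the list of all records that carry it
por_numeracao = defaultdict(list)
for record in json_data["Dados"]:
    por_numeracao[record["Numeracao"]].append(record)

# .get() avoids creating an empty entry when the key is missing
dates = [r["Data_inscricao"] for r in por_numeracao.get(656, [])]
print(dates)  # ['2010-05-02', '2012-11-30']
```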

But if you search only once and then discard the data, you can iterate over your data list and pick just the item you need (O(n)).

#!/usr/bin/python3

import json

# json_data assigned in the previous code

# Desired Numeracao
numeracao = 656

# Iterate json_data['Dados'] and keep only the items with that Numeracao (returns a generator)
dados = (x for x in json_data['Dados'] if x['Numeracao'] == numeracao)

# Return the first item with the desired Numeracao, or None if it does not exist
item = next(dados, None)

if item:
    print('Data found ---> Data_inscricao:', item['Data_inscricao'])
else:
    print('Data not found')
    
09.06.2017 / 23:12