Amount in a dataset [closed]

-1

I have this dataset

I would like to count the instances of 'DATA_CRIA' in relation to this other dataset

I tried the following, but it did not work:

lista = []
for ano in anos:
    lista.append(oni['DATA_CRIA'].count())
print(lista)

    
asked by anonymous 09.08.2018 / 18:35

1 answer

0

I am assuming you are using pandas and a DataFrame.

Starting from this:

import pandas as pd

anos = [2008, 2009, 2010]

d = {'DATA_CRIA': [2008, 2008, 2009, 1909], 'DISTANCIA': [12.3, 33.3, 11.1, 43.4]}
df = pd.DataFrame(data=d)
>>> print(df)
   DATA_CRIA  DISTANCIA
0       2008       12.3
1       2008       33.3
2       2009       11.1
3       1909       43.4

We can create a function that counts the occurrences of a given year in a DataFrame:

def conta_ocorrencias(df, ano):
    ocorrencias = 0
    for i in range(len(df)):
        if df.iloc[i]['DATA_CRIA'] == ano:
            ocorrencias += 1
    return ocorrencias

And so we do:

lista = []
for ano in anos:
    lista.append(conta_ocorrencias(df, ano))
>>> print(lista)
[2, 1, 0]
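
The per-year count above can also be written without an explicit row loop. This is only a sketch, not part of the original answer: it assumes the same df and anos as above, and the helper name conta_ocorrencias_vet is made up for illustration. Comparing the column to the year gives a boolean Series, and summing it counts the True values.

```python
import pandas as pd

anos = [2008, 2009, 2010]
df = pd.DataFrame({'DATA_CRIA': [2008, 2008, 2009, 1909],
                   'DISTANCIA': [12.3, 33.3, 11.1, 43.4]})

def conta_ocorrencias_vet(df, ano):
    # Hypothetical helper (not from the original answer):
    # the comparison yields a boolean Series; True counts as 1 in the sum.
    return int((df['DATA_CRIA'] == ano).sum())

lista = [conta_ocorrencias_vet(df, ano) for ano in anos]
print(lista)  # [2, 1, 0]
```

Like the function above, it still scans the column once per year.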

But this is not very efficient, because it goes through df once for every year, so we can change the approach.

# create a dictionary with the years as keys to count the occurrences
dic_ano = {}
for ano in anos:
    dic_ano.update({ano:0})
>>> print(dic_ano)
{2008: 0, 2009: 0, 2010: 0}

And here I go through df just once:

for i in range(len(df)):
    for ano in anos:
        if df.iloc[i]['DATA_CRIA'] == ano:
            dic_ano[ano] = dic_ano[ano]+1
>>> print(dic_ano)
{2008: 2, 2009: 1, 2010: 0}
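
If a pandas-native route is acceptable, the single pass can probably be left to the library. The sketch below is an alternative to the loop above, not the original answer's method. It assumes the same df and anos: value_counts() tallies every distinct year in one pass, and reindex(anos, fill_value=0) keeps only the years of interest, filling in 0 for years that never appear (such as 2010).

```python
import pandas as pd

anos = [2008, 2009, 2010]
df = pd.DataFrame({'DATA_CRIA': [2008, 2008, 2009, 1909],
                   'DISTANCIA': [12.3, 33.3, 11.1, 43.4]})

# Count every distinct year in one pass over the column.
contagem = df['DATA_CRIA'].value_counts()

# Keep only the requested years, in the order given; absent years get 0.
contagem = contagem.reindex(anos, fill_value=0)

dic_ano = contagem.to_dict()
print(dic_ano)
```

This should give the same counts as the dictionary built by the loop, and years outside anos (1909 here) are dropped by the reindex step.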
    
09.08.2018 / 19:05