Select rows by first digits, and average them by year

I have a CSV of the following type:

code   year   sales
2011   1970   5000
2011   1971   5200
2011   1972   ...
...   
2015   1970
2015   1971
2015   1972
...
3025
...
3026
...
3052
...

How can I select all the rows whose code starts with '20' or '30', and average the sales by year?

Thank you so much!

    
asked by anonymous 10.11.2018 / 02:10

1 answer

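import collections
import csv

# Running sum of sales and number of rows for each (code, year) pair
soma = collections.defaultdict(float)
qtd = collections.defaultdict(int)

with open('arquivo.csv', newline='') as f:
    # The file is read as tab-separated; change delimiter if yours differs
    cf = csv.DictReader(f, delimiter='\t')
    for reg in cf:
        chave = (reg['code'], reg['year'])
        soma[chave] += float(reg['sales'])
        qtd[chave] += 1

# Print the average for each (code, year) pair, in sorted order
for code, year in sorted(soma):
    print("The average for code {} in year {} is: {}".format(
        code, year, soma[code, year] / qtd[code, year]
    ))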
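
The code above averages every (code, year) pair and does not filter by the first digits. If the goal is one average per prefix and year, which is only one possible reading of the question, a minimal sketch could look like the following. The function name average_by_prefix_and_year, its prefixes parameter, and the sample data are hypothetical and are not part of the original answer.

```python
import collections
import csv
import io


def average_by_prefix_and_year(rows, prefixes=('20', '30')):
    """Return {(prefix, year): average sales} for codes starting with a prefix."""
    soma = collections.defaultdict(float)
    qtd = collections.defaultdict(int)
    for reg in rows:
        code = reg['code']
        # Find the first wanted prefix this code starts with, if any
        prefix = next((p for p in prefixes if code.startswith(p)), None)
        if prefix is None:
            continue  # skip codes that match no prefix
        chave = (prefix, reg['year'])
        soma[chave] += float(reg['sales'])
        qtd[chave] += 1
    return {chave: soma[chave] / qtd[chave] for chave in soma}


# Made-up, tab-separated sample data (assumption: same layout as the question)
sample = (
    "code\tyear\tsales\n"
    "2011\t1970\t5000\n"
    "2015\t1970\t7000\n"
    "3025\t1970\t100\n"
    "4001\t1970\t999\n"
)
medias = average_by_prefix_and_year(
    csv.DictReader(io.StringIO(sample), delimiter='\t')
)
for (prefix, year), media in sorted(medias.items()):
    print("Average for codes starting with {} in year {}: {}".format(
        prefix, year, media
    ))
```

To run it on the real file, pass the csv.DictReader built from open('arquivo.csv', newline='') instead of the in-memory sample. Rows with an empty sales field would raise a ValueError in float(), so they would need to be skipped or filled in first.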
    
10.11.2018 / 17:05