I have the following file in CSV ( 12 Millions of records):
UF Municipio Cod NIS Valor Data
MA IMPERATRIZ 803 16361947271 45.00 01/01/2011
MA IMPERATRIZ 803 74629273937 15.00 01/01/2011
BA RUY BARBOSA 3845 16481166579 50.00 01/02/2011
BA RUY BARBOSA 3845 16481166579 50.00 01/03/2011
MG IPATINGA 653 73639474937 10.00 01/03/2011
MG IPATINGA 653 83733638376 20.00 01/03/2011
MG IPATINGA 653 52648747648 25.00 01/03/2011
...
I need to group the data by Date , UF and City > by calculating the amount of NIS and adding the values. That is, for each group of Date , UF and Municipality , I need to count the amount of NIS and add the values. For the above data, the desired result would be:
Data UF Municipio Quant. Valor
01/01/2011 MA IMPERATRIZ 002 60.00
01/02/2011 BA RUY BARBOSA 001 50.00
01/03/2011 BA RUY BARBOSA 001 50.00
01/03/2011 MG IPATINGA 003 55.00
...
This result should generate a new CSV file. To add or count values I use the following codes (which work):
Conta_NIS = csvPanda.groupby(['Data', 'UF', 'Municipio']).NIS.count()
Soma_Valor = csvPanda.groupby(['Data', 'UF', 'Municipio']).Valor.sum()
But how do you include the two aggregations ( count
and sum
) in the same output to export to a new CSV
Very grateful to all!