Grouping and aggregating data

2

I have the following file in CSV ( 12 Millions of records):

UF Municipio   Cod  NIS         Valor Data  
MA IMPERATRIZ  803  16361947271 45.00 01/01/2011  
MA IMPERATRIZ  803  74629273937 15.00 01/01/2011  
BA RUY BARBOSA 3845 16481166579 50.00 01/02/2011  
BA RUY BARBOSA 3845 16481166579 50.00 01/03/2011  
MG IPATINGA    653  73639474937 10.00 01/03/2011  
MG IPATINGA    653  83733638376 20.00 01/03/2011  
MG IPATINGA    653  52648747648 25.00 01/03/2011  
...  

I need to group the data by Date , UF and City > by calculating the amount of NIS and adding the values. That is, for each group of Date , UF and Municipality , I need to count the amount of NIS and add the values. For the above data, the desired result would be:

Data       UF Municipio   Quant. Valor  
01/01/2011 MA IMPERATRIZ  002    60.00  
01/02/2011 BA RUY BARBOSA 001    50.00  
01/03/2011 BA RUY BARBOSA 001    50.00  
01/03/2011 MG IPATINGA    003    55.00  
...  

This result should generate a new CSV file. To add or count values I use the following codes (which work):

Conta_NIS = csvPanda.groupby(['Data', 'UF', 'Municipio']).NIS.count()  
Soma_Valor = csvPanda.groupby(['Data', 'UF', 'Municipio']).Valor.sum() 

But how do you include the two aggregations ( count and sum ) in the same output to export to a new CSV

Very grateful to all!

    
asked by anonymous 18.05.2016 / 00:47

1 answer

2

I was able to find the solution using the Pandas "groupby". I created two distinct groupings, but for the same fields, one by value (sum of the Value column) and one by counting the NIS number.

BF_Value = csvPanda.groupby (['Data', 'UF', 'Municipality')) Valor.sum ()
BF_NIS = csvPanda.groupby (['Data', 'UF', 'Municipality')) NIS.nunique ()

Then I created two Data Frames with this data:

DF_Value = pandas.DataFrame (BF_Value) DF_NIS = pandas.DataFrame (BF_NIS)

Finally I concatenated them into a single dataset:

frames = [DF_Value, DF_NIS] DF_BF_Page = pandas.concat (frames, axis = 1)

It worked. Very grateful to all!

    
24.05.2016 / 15:31