groupby - add according to criteria

0

I intend to add all the earnings of an official, for his cpf, but only when he earns 100, disregarding other values. I'm using this code, but it's not right:

aip.groupby(['CPF']).sum() where(aip['Valor']==100)

By taking the sum column, which sums everything up, and not just at the occurrence of 100.00, the worksheet template that is working is equal to that:

link

But I do not want everything to be deleted, but only in the case where the commission is maximum, that is, 100.

    
asked by anonymous 11.11.2018 / 14:13

1 answer

1

Question

Hello Roger, to solve is simple, we just have to think of the logic of pandas chaining to perform correctly. If you want to sum (aggregate) the CPFs with values greater than 100 then: First you filter the values, then group the CPFs and finally, sum.

Example

We import the libraries and create the dataframe

We generated 300 randomly values from 1 to 500 for the data and from 1 to 4 for the CPF.

# importa as bibliotecas
import pandas as pd
import numpy as np

# cria o dataframe do exemplo
data = pd.Series(np.random.randint(1, 501, size=300), name='Dados')
cpf = pd.Series(np.random.randint(1, 5, size=300), name='CPF')
df = pd.concat([data, cpf], axis=1)

df Output

    Dados   CPF
0   424     4
1   416     1
2   231     1
3   423     1
4   36      1
5   14      4
6   317     1
7   4       4
8   34      3
9   98      1
10  464     4
...

Problem resolution

As already mentioned, to resolve it is enough: 1. filter the data; 2. group them by the desired column; 3. add up. Other operations can be performed as: count (count) or mean (mean).

df[df['Dados'] > 100].groupby(['CPF']).sum()

Output

CPF Dados
1   19023
2   17130
3   16998
4   16309

To solve any other future problem using pandas, think about how to create a pipeline of logical operations more suitable for this. It requires training, but you get the hang of it. Hug and good studies.

    
26.11.2018 / 19:12