Question
Hello Roger, to solve is simple, we just have to think of the logic of pandas chaining to perform correctly. If you want to sum (aggregate) the CPFs with values greater than 100 then: First you filter the values, then group the CPFs and finally, sum.
Example
We import the libraries and create the dataframe
We generated 300 randomly values from 1 to 500 for the data and from 1 to 4 for the CPF.
# importa as bibliotecas
import pandas as pd
import numpy as np
# cria o dataframe do exemplo
data = pd.Series(np.random.randint(1, 501, size=300), name='Dados')
cpf = pd.Series(np.random.randint(1, 5, size=300), name='CPF')
df = pd.concat([data, cpf], axis=1)
df Output
Dados CPF
0 424 4
1 416 1
2 231 1
3 423 1
4 36 1
5 14 4
6 317 1
7 4 4
8 34 3
9 98 1
10 464 4
...
Problem resolution
As already mentioned, to resolve it is enough: 1. filter the data; 2. group them by the desired column; 3. add up. Other operations can be performed as: count (count) or mean (mean).
df[df['Dados'] > 100].groupby(['CPF']).sum()
Output
CPF Dados
1 19023
2 17130
3 16998
4 16309
To solve any other future problem using pandas, think about how to create a pipeline of logical operations more suitable for this. It requires training, but you get the hang of it. Hug and good studies.