Remove duplicate dates by summing their values

Question

I need to remove the duplicate dates from a dataframe and sum the values that correspond to each of those dates.

I found an answer on Stack Overflow that is close to what I need, but I could not adapt it to my case:

df.groupby('data', group_keys=False).apply(lambda x: x.loc[x.valor.idxmax()])

That code groups by date and keeps only the row with the highest value. What I need is to group by date and keep the sum of the values instead.
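A minimal sketch of the behaviour being asked for, assuming the column names data and valor from the snippet above. The sample rows are made up for illustration and are not from the original post:

```python
import pandas as pd

# Hypothetical sample data; only the column names 'data' and 'valor'
# come from the question.
df = pd.DataFrame({
    "data": ["2017-06-01", "2017-06-01", "2017-06-02"],
    "valor": [10.0, 5.0, 7.0],
})

# One row per date; the values of duplicated dates are added together.
summed = df.groupby("data", as_index=False)["valor"].sum()
print(summed)
```

Passing as_index=False keeps data as a regular column rather than turning it into the index.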

python python-3.x pandas

asked by anonymous 21.06.2017 / 21:54

1 answer

score 4 · Accepted Answer

I managed to solve the problem, so I am posting the answer to help anyone who runs into the same issue in the future.

The following is an explanation of the code:

Generating the dataframe from an existing dictionary:

import pandas as pd
swap_df = pd.DataFrame(swap_montado, columns=['Portfolio', 'Data posicao', 'Valor'])

Grouping the data by date and summing the Valor entries that belong to DUPLICATED dates (for Portfolio, the first entry of each date is kept):

# 'first' keeps the first Portfolio per date; 'sum' adds up Valor per date
swap_df = swap_df.groupby('Data posicao').agg({
    'Portfolio': 'first',
    'Valor': 'sum',
})
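To try the grouping step on its own, here is a self-contained sketch. The contents of swap_montado are hypothetical, since the original dictionary is not shown in the post; only the column names come from the answer. The aggregation is written with the string 'sum':

```python
import pandas as pd

# Hypothetical stand-in for swap_montado; the real dictionary is not shown.
swap_montado = {
    "Portfolio": ["A", "A", "B"],
    "Data posicao": ["2017-06-20", "2017-06-20", "2017-06-21"],
    "Valor": [100.0, 50.0, 30.0],
}
swap_df = pd.DataFrame(swap_montado, columns=["Portfolio", "Data posicao", "Valor"])

# Collapse duplicated dates: keep the first Portfolio, sum the Valor entries.
grouped = swap_df.groupby("Data posicao").agg({
    "Portfolio": "first",
    "Valor": "sum",
})
print(grouped)
```

Note that after the groupby, Data posicao is the index of the result, not a column.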

Rearranging the order of dataframe columns:

swap_df = swap_df[['Valor', 'Portfolio']]
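Because the grouping key ends up as the index, selecting ['Valor', 'Portfolio'] only reorders the remaining columns. If the date is needed as an ordinary column again, reset_index() is one way to get it back. The sketch below uses a hypothetical grouped result with made-up values:

```python
import pandas as pd

# Hypothetical grouped result, indexed by 'Data posicao' as after the groupby.
grouped = pd.DataFrame(
    {"Portfolio": ["A", "B"], "Valor": [150.0, 30.0]},
    index=pd.Index(["2017-06-20", "2017-06-21"], name="Data posicao"),
)

# Reorder the columns; the date stays in the index.
reordered = grouped[["Valor", "Portfolio"]]

# Optional: move the date from the index back into a regular column.
flat = reordered.reset_index()
print(flat)
```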

Resolution found at: link