Find the most repeated value

3

I'm trying to analyze shoe sales data, but I'm having a hard time writing a function that finds the shoe size each customer bought most often in the previous year.

I have a table with this data:

Cód. Cliente  CPF          Nome                              Sexo       Tamanho
5879099       37513584800  LOJA                              MASCULINO  35
5879099       37513584800  LOJA                              MASCULINO  23
5879099       37513584800  LOJA                              MASCULINO  17
5879099       37513584800  LOJA                              MASCULINO  37
5879099       37513584800  LOJA                              MASCULINO  17
3353800       2613618809   DULIO JOSE DE SOUSA DAMICO        MASCULINO  35
3353800       2613618809   DULIO JOSE DE SOUSA DAMICO        MASCULINO  39
3112300       29953652805  ROSANA DA SILVA FAGUNDES          FEMININO   34
6116202       39285701884  ANA CAROLINA DE FARIAS FRANCISCO  FEMININO   31

The real table has many more rows than the few shown here.

What I need to know is the most frequent size (Tamanho) for each customer's CPF.

In other words, which size did each customer buy most often?

I could not find a way to do this. Can anyone shed some light?

Thanks,

    
asked by anonymous 29.08.2018 / 19:46

1 answer

2

Iuri, you could use a pivot table ('pivot_table') in pandas.

It would look something like this:

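import pandas as pd

df = pd.read_excel("YOUR FILE")  # placeholder: path to your spreadsheet
# Count the rows for each (CPF, Tamanho) pair. 'values' has to be a column
# other than the index keys; "Cód. Cliente" is used here only as something to count.
table = pd.pivot_table(df, index=["CPF", "Tamanho"],
                       values="Cód. Cliente",
                       aggfunc="count", fill_value=0)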

I used 'read_excel' just as an example. In your case, just fill the DataFrame with your own data.
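If it helps, here is a minimal sketch of filling the DataFrame by hand instead of reading a file, using the sample rows from the question. Only the two columns needed for the count are included. The variable name 'counts' is my own choice, and 'groupby(...).size()' is used as a simple stand-in for the pivot table count.

```python
import pandas as pd

# Sample rows copied from the question (CPF and Tamanho columns only).
df = pd.DataFrame({
    "CPF": [37513584800] * 5 + [2613618809] * 2 + [29953652805, 39285701884],
    "Tamanho": [35, 23, 17, 37, 17, 35, 39, 34, 31],
})

# Number of rows for each (CPF, Tamanho) pair.
counts = df.groupby(["CPF", "Tamanho"]).size()
print(counts)
```

Each entry of 'counts' is how many times that customer bought that size.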

The 'index' parameter sets the columns the pivot table groups by, that is, the category columns you want to use,

and 'aggfunc' (the aggregate function) is the operation applied to each group. Here I am using a count.
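The count on its own does not yet say which size wins for each CPF. One possible way to get there, sketched on the sample rows from the question, is to put the sizes in columns and take the column with the largest count per row. The names 'counts' and 'most_bought' are my own. Note that 'idxmax' keeps the first maximum it finds, so in a tie the smallest size is returned.

```python
import pandas as pd

# Sample rows copied from the question (CPF and Tamanho columns only).
df = pd.DataFrame({
    "CPF": [37513584800] * 5 + [2613618809] * 2 + [29953652805, 39285701884],
    "Tamanho": [35, 23, 17, 37, 17, 35, 39, 34, 31],
})

# One row per CPF, one column per size, each cell holding the purchase count.
counts = df.groupby(["CPF", "Tamanho"]).size().unstack(fill_value=0)

# For each CPF, the column label (size) with the highest count.
most_bought = counts.idxmax(axis=1)
print(most_bought)
```

If ties matter for your analysis, you would need to decide how to break them, for example by keeping every size that reaches the maximum count.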

This link has interesting content on pivot tables that may help further.

    
30.08.2018 / 14:16