Take a random sample of a dataset with pandas

0

I'm trying to extract 150 rows from a 500-row dataset, and I would like to pick them at random.

My Data

objeto,cor,label
cachorro,branco,animal
manga,laranja,fruta
calça,preta,roupa

My script

import pandas as pd

df = pd.read_csv('produit_non_conforme.csv', sep = ',')
mails_random = df.sample(150) 

print(mails_random)

But the result looks strange: the rows are not printed in full ...

         objeto                ...         label
277      uva                   ...         fruta
116      urso                  ...         animal
495      ...                   ...         ...

Is it possible to display the full rows?

    
asked by anonymous 23.08.2018 / 23:12

2 answers

1

You could just take the values of the DataFrame and use them as you see fit.

Ex:

import pandas as pd

df = pd.read_csv('teste.csv', sep = ',')
mails_random = df.sample(2) 

for linha in mails_random.values:
    print(linha)  # ['coluna_1', 'coluna_2', 'coluna_3']

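If the goal is only to see every column when printing, a minimal sketch along these lines may help. It assumes the "..." in the question comes from pandas shortening wide output, and it uses an in-memory copy of the sample rows from the question instead of the real CSV file (the names csv_text, amostra and texto_completo are made up for this illustration).

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file, built from the sample rows in the question.
csv_text = """objeto,cor,label
cachorro,branco,animal
manga,laranja,fruta
calça,preta,roupa
"""

df = pd.read_csv(io.StringIO(csv_text), sep=',')
amostra = df.sample(2, random_state=0)  # random_state is only here for repeatability

# to_string() renders every column of every row, without shortening the output.
texto_completo = amostra.to_string()
print(texto_completo)

# Alternatively, lift the column limit for a single print call.
with pd.option_context('display.max_columns', None):
    print(amostra)
```

to_string() returns plain text, so it is convenient for a one-off check, while option_context changes the display setting only inside the with block.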

    
19.09.2018 / 16:39
1

Suppose you want to select two random rows from a 4-row DataFrame.

You can proceed as follows:

import numpy as np
import pandas as pd


df = pd.DataFrame({'a':[1,2,3,4],'b':[1,2,3,4]})

df_index = list(df.index)

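# replace=False so that the same row cannot be drawn twice
indexs = np.random.choice(df_index, 2, replace=False)

# df_index holds index labels, so select with .loc rather than .iloc
new_df = df.loc[indexs]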
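For comparison, a two-row draw like this can also be sketched with DataFrame.sample alone, which is the method already used in the question. The random_state value and the name new_df_renumerado are arbitrary choices for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [1, 2, 3, 4]})

# sample() draws rows without replacement by default, so no row is repeated.
# random_state fixes the draw; omit it to get a different sample on each run.
new_df = df.sample(n=2, random_state=42)

# The original index labels are kept; reset_index(drop=True) renumbers from 0.
new_df_renumerado = new_df.reset_index(drop=True)
print(new_df)
```

Keeping the original index makes it easy to trace each sampled row back to the source data.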

I hope this helps.

    
19.09.2018 / 16:19