Importing a data file with the ".data" extension


My Python code loads a dataset from the internet, but it does not detect the correct number of columns.

Python code:

import pandas as pd

## Importing data
data = pd.read_table('https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data', header=1)

print(len(data.columns)) 
2

Do I need to set any other parameters? I tried delimiter = "\t" but got the same result.

    
asked by anonymous 20.05.2018 / 17:19

1 answer


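The first 8 fields of the file are separated by spaces, not by TAB ( \t ); only the last field (the car name) is preceded by a TAB. That is why reading with the TAB separator, which is also the default of read_table, yields just 2 columns.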

To read the file correctly, add the parameter delim_whitespace=True and change the parameter header to None, because the file has no header row (header=0 would turn the first data row into the column names). In recent pandas versions delim_whitespace is deprecated; sep=r'\s+' is the equivalent.

Example:

data = pd.read_table('https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data', delim_whitespace=True, header=None)

data
Out[15]:
      0  1      2      3       4     5   6  7                          8
0  18.0  8  307.0  130.0  3504.0  12.0  70  1  chevrolet chevelle malibu
1  15.0  8  350.0  165.0  3693.0  11.5  70  1          buick skylark 320
2  18.0  8  318.0  150.0  3436.0  11.0  70  1         plymouth satellite
3  16.0  8  304.0  150.0  3433.0  12.0  70  1              amc rebel sst
4  17.0  8  302.0  140.0  3449.0  10.5  70  1                ford torino
5  15.0  8  429.0  198.0  4341.0  10.0  70  1           ford galaxie 500
6  14.0  8  454.0  220.0  4354.0   9.0  70  1           chevrolet impala
...

print(len(data.columns))
9
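
If you want to check the separator behavior without downloading anything, here is a minimal sketch that parses two sample lines from an in-memory string. The lines are modeled on the rows shown above, assuming the raw layout is spaces between the first 8 fields and a TAB before the quoted car name. The column names are my own assumption for illustration; the file itself carries no header, which is why header=None is passed.

```python
import io

import pandas as pd

# Two sample lines in the assumed layout of auto-mpg.data:
# spaces between the first 8 fields, a TAB before the quoted car name.
sample = (
    '18.0   8   307.0      130.0      3504.      12.0   70  1\t"chevrolet chevelle malibu"\n'
    '15.0   8   350.0      165.0      3693.      11.5   70  1\t"buick skylark 320"\n'
)

# Hypothetical column names (an assumption, not stored in the file).
names = [
    "mpg", "cylinders", "displacement", "horsepower", "weight",
    "acceleration", "model_year", "origin", "car_name",
]

# Splitting on TAB only: everything before the single TAB becomes one column.
tab_split = pd.read_csv(io.StringIO(sample), sep="\t", header=None)

# Splitting on any run of whitespace: the quoted car name stays one field.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None, names=names)

print(len(tab_split.columns))
print(len(df.columns))
```

The first print reproduces the 2 columns from the question, the second gives the expected 9.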
    
20.05.2018 / 19:08