Linear regression application

Question

I have two lists:

print(lista)
[970084.4148727012, 983104.7719906792, 996164.0, 1006426.5111488493, 1016687.0370821969, 1026941.5758164332, 1037185.9604590479, 1047415.8544247652, 1057626.746645888, 1067813.94679318, 1077972.5805253708, 1088097.584787312, 1098183.7031788095, 1147832.9385862947, 1195602.90322828, 1281768.5077875573]

print(new_list)
[3161, 3185, 3164, 3152, 3154, 3146, 3144, 3174, 0, 0, 0, 0, 0, 0, 0, 0]

I want to use linear regression to predict the values that are 0 in new_list, so I selected only the first 8 items:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
X = lista[:8]
y = new_list[:8]

I split the data into training and test sets:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=2)

And I applied linear regression:

regr = LinearRegression() 
regr.fit(X_train, y_train)

But it gave an error:

ValueError: Expected 2D array, got 1D array instead: array=[970084.4148727 983104.77199068 996164. 1006426.51114885 1016687.0370822 1026941.57581643 1037185.96045905 1047415.85442477]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

What should I do?

python

asked by anonymous 09.08.2018 / 22:02

1 answer


score 2 · Accepted Answer

sklearn expects your X data to be two-dimensional, that is, a list of lists with one inner list per sample. Given a flat list, it cannot tell whether you mean a dataset with 8 features and 1 sample or a dataset with 1 feature and 8 samples.
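To make the ambiguity concrete, here is a small sketch that uses NumPy (my choice for illustration; the values are the first 8 items of lista, rounded) to show the two shapes the same eight numbers could take:

import numpy as np

# First 8 values of lista, rounded for readability
values = [970084.41, 983104.77, 996164.0, 1006426.51,
          1016687.04, 1026941.58, 1037185.96, 1047415.85]

flat = np.array(values)            # shape (8,): ambiguous for sklearn
as_samples = flat.reshape(-1, 1)   # shape (8, 1): 8 samples, 1 feature
as_features = flat.reshape(1, -1)  # shape (1, 8): 1 sample, 8 features

print(flat.shape, as_samples.shape, as_features.shape)
# prints: (8,) (8, 1) (1, 8)

In your case each number is one observation of a single feature, so the (8, 1) layout is the one you want.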

To solve this, you can turn your list into a list of lists:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
lista = [[elemento] for elemento in lista]
X = lista[:8]
y = new_list[:8]

...

Or use NumPy's reshape, as the error message suggests (Reshape your data using array.reshape(-1, 1) if your data has a single feature):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
X = lista[:8]
X = np.array(X).reshape(-1, 1)
y = new_list[:8]

...
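
For completeness, here is one way the whole thing might look end to end. This is only a sketch: it skips train_test_split and fits on all 8 known points, and the names X_known, X_missing, predicted and filled are mine, not from your code. Whether a straight line is a good model for this data is a separate question that the code does not answer.

import numpy as np
from sklearn.linear_model import LinearRegression

lista = [970084.4148727012, 983104.7719906792, 996164.0, 1006426.5111488493,
         1016687.0370821969, 1026941.5758164332, 1037185.9604590479,
         1047415.8544247652, 1057626.746645888, 1067813.94679318,
         1077972.5805253708, 1088097.584787312, 1098183.7031788095,
         1147832.9385862947, 1195602.90322828, 1281768.5077875573]
new_list = [3161, 3185, 3164, 3152, 3154, 3146, 3144, 3174,
            0, 0, 0, 0, 0, 0, 0, 0]

# 2D inputs: one row per sample, one column for the single feature
X_known = np.array(lista[:8]).reshape(-1, 1)
y_known = np.array(new_list[:8])
X_missing = np.array(lista[8:]).reshape(-1, 1)

regr = LinearRegression()
regr.fit(X_known, y_known)

# Predict the 8 entries that are 0 in new_list
predicted = regr.predict(X_missing)

# Known values followed by the predictions
filled = new_list[:8] + [float(p) for p in predicted]
print(filled)

Note that predict needs the same 2D shape as fit, so the remaining values of lista are reshaped too.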