2021-04-30

Problem with iteration over PolynomialFeatures

I am trying to code a for loop that iterates over the degrees of the polynomial and returns the r2 score. Below is my code:

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

X_data = data[["Horsepower", "Peakrpm", "Enginesize", "Wheelbase", "Compressionratio"]]
y_data = log_data

s = StandardScaler()

X_train, X_test, y_train, y_test = train_test_split(X_data, y_data, test_size=0.3, random_state=42)
X_train = s.fit_transform(X_train)
X_test = s.transform(X_test)

print(X_train.shape)
print(y_train.shape)
for i in range(3):
    poly = PolynomialFeatures(degree=i, include_bias=False, interaction_only=False)
    X_train = pd.DataFrame(
        poly.fit_transform(X_train),
        columns=poly.get_feature_names(input_features=data.columns),
    )
    lr_poly = LinearRegression().fit(X_train, y_train)
    print(lr_poly.score(X_train, y_train))

and I am getting this error:

could not broadcast input array from shape (143,5) into shape (143,0)

Should I reshape X_train somewhere?

I understand that X_train gets a different shape on each iteration because of the polynomial degree, but since it only gains columns, why does this error keep popping up?
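
A likely explanation, offered as a sketch rather than a definitive answer: range(3) yields 0, 1, 2, so the first pass requests degree=0. With include_bias=False that leaves no output columns at all, which is consistent with the (143, 0) shape in the message. Two other things in the loop look risky: X_train is overwritten on every pass, so each expansion would be applied to the previous pass's output, and data.columns may list more names than the five selected columns. The example below uses synthetic data in place of the real data frame (an assumption, since data and log_data are not shown), starts at degree 1, and keeps the scaled inputs untouched. The helper n_poly_features is hypothetical and only illustrates the column count.

```python
from math import comb

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def n_poly_features(n_inputs, degree, include_bias=False):
    """Hypothetical helper: column count of a full polynomial expansion
    (interaction_only=False) for the given number of inputs and degree."""
    return comb(n_inputs + degree, degree) - (0 if include_bias else 1)


# With 5 inputs, degree 0 and no bias column, there is nothing left to output.
print(n_poly_features(5, 0))

# Synthetic stand-in for the real data (assumption): 205 rows and 5 features,
# so a 70/30 split leaves 143 training rows as in the error message.
rng = np.random.default_rng(42)
X = rng.normal(size=(205, 5))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=205)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

results = {}
for degree in range(1, 4):  # start at 1, not 0
    poly = PolynomialFeatures(degree=degree, include_bias=False)
    # Expand the scaled inputs into new variables; X_train_s is never overwritten.
    X_train_poly = poly.fit_transform(X_train_s)
    X_test_poly = poly.transform(X_test_s)

    model = LinearRegression().fit(X_train_poly, y_train)
    r2_train = model.score(X_train_poly, y_train)
    r2_test = model.score(X_test_poly, y_test)
    results[degree] = (X_train_poly.shape[1], r2_train, r2_test)
    print(f"degree={degree} n_features={X_train_poly.shape[1]} "
          f"train R2={r2_train:.3f} test R2={r2_test:.3f}")
```

Scoring on the held-out split as well as the training split helps show when a higher degree is only fitting noise. If column names are needed, they would have to come from the five selected columns (X_data.columns) rather than from data.columns.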



from Recent Questions - Stack Overflow https://ift.tt/32ZTbFK