I'm new to Python and have been learning about pipelines from DataCamp. I have been experimenting with some FIFA data that contains missing (NaN) values. I tried to create a pipeline that first imputes the missing data (replacing each NaN with the column mean) and then fits a logistic regression. The code runs without any errors. However, when I print x_train and y_pred, the output still contains NaN values. Does that mean my pipeline is not working and the data was not imputed correctly? I expected to see the mean values in place of the NaNs. I would appreciate an answer in layman's terms, as I am new to the topic.
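import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Load the data and one-hot encode the categorical columns
fif_data = pd.read_csv("fifa_draft_1.csv")
df_Foot_Dummy = pd.get_dummies(fif_data, drop_first=True)
# Pipeline steps: mean imputation, then logistic regression
imp = SimpleImputer(missing_values=np.nan, strategy="mean")
logreg = LogisticRegression()
# Feature (a single column, reshaped to 2-D) and target
x = df_Foot_Dummy["passing"].values.reshape(-1, 1)
y = df_Foot_Dummy["preferred_foot_Right"]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
steps = [("imputation", imp), ("logistic_regression", logreg)]
pipe = Pipeline(steps)
pipe.fit(x_train, y_train)
y_pred = pipe.predict(x_test)
# This is where I still see NaN values in the output
print(x_train)
print(y_pred)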
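For reference, here is a minimal sketch of one way to check what the imputation step produces. It uses made-up data rather than the FIFA file, and the names x_demo, y_demo, pipe_demo, x_imputed and y_pred_demo are hypothetical. It assumes SimpleImputer is left at its default copy=True, in which case the pipeline imputes an internal copy and the array passed to fit is not modified, so printing that array would still show NaN.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the "passing" column and the target
x_demo = np.array([[50.0], [np.nan], [70.0], [60.0], [np.nan], [80.0]])
y_demo = np.array([0, 1, 1, 0, 0, 1])

pipe_demo = Pipeline([
    ("imputation", SimpleImputer(missing_values=np.nan, strategy="mean")),
    ("logistic_regression", LogisticRegression()),
])
pipe_demo.fit(x_demo, y_demo)

# The array that was passed in is left untouched (assuming copy=True)
print(np.isnan(x_demo).sum())

# Ask the fitted imputation step for the data the model actually sees
x_imputed = pipe_demo.named_steps["imputation"].transform(x_demo)
print(np.isnan(x_imputed).sum())
print(x_imputed.ravel())

# Predictions are class labels from the logistic regression step
y_pred_demo = pipe_demo.predict(x_demo)
print(y_pred_demo)
```

Applied to the code above, the equivalent check would be pipe.named_steps["imputation"].transform(x_train), which returns a new array and leaves x_train as it was.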