Perhaps you can solve your specific problem with
X_train_Specfeatures.columns = X_train_features.columns
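A minimal sketch of what that assignment does, using two small stand-in frames. Only the variable names come from the question; the data and the labels a and b are made up for illustration. Note that the assignment is positional, so it assumes both frames have the same number of columns in the same order.

```python
import pandas as pd

# Hypothetical stand-ins for the frames in the question.
X_train_features = pd.DataFrame([[0, 1], [2, 3]], columns=["a", "b"])
X_train_Specfeatures = pd.DataFrame([[4, 5], [6, 7]], columns=[0, 1])

# Relabel by position: the first column becomes "a", the second "b".
# pandas raises a ValueError here if the number of columns differs.
X_train_Specfeatures.columns = X_train_features.columns

# With identical labels the rows stack into the same two columns.
combined = pd.concat([X_train_features, X_train_Specfeatures], ignore_index=True)
print(combined.shape)
```

If the columns of the two frames are in a different order, relabeling this way would silently pair the wrong columns, so it is worth checking the order first.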
Background
As mentioned in the comments, this usually happens when the column labels of the two DataFrames do not match.
Take these two dfs:
import pandas as pd

df = pd.DataFrame([[0, 1], [2, 3]])
df2 = df.copy()
If you concat them (or append, which behaves the same way but was removed in pandas 2.0), you will get a 4x2 df because the column labels are exactly the same.
# df_out = df.append(df2, ignore_index=True)
df_out = pd.concat([df, df2], ignore_index=True)
print(df_out)
   0  1
0  0  1
1  2  3
2  0  1
3  2  3
But if you change the column names in one df you will get a 4x4 df, because pandas tries to align the column labels.
df2.columns = ['0', '1']
# df_out = df.append(df2, ignore_index=True)
df_out = pd.concat([df, df2], ignore_index=True)
print(df_out)
     0    1    0    1
0  0.0  1.0  NaN  NaN
1  2.0  3.0  NaN  NaN
2  NaN  NaN  0.0  1.0
3  NaN  NaN  2.0  3.0
Notice that even though the column names print the same, they are actually different values (in one df 0 is an integer and in the other it is the string '0'). So pandas treats them as four distinct columns, and since each df has no values for the other's columns, it fills those cells with NaN. That is also why the integers show up as floats: NaN is a float value.
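One way to confirm this kind of mismatch is to look at the Python type of each label instead of its printed form. The sketch below reuses the two example frames from above; converting the labels with astype(int) is just one possible repair and assumes every label is a string that parses as an integer.

```python
import pandas as pd

df = pd.DataFrame([[0, 1], [2, 3]])
df2 = df.copy()
df2.columns = ["0", "1"]

# The printed labels look alike, but their Python types differ.
print([type(c).__name__ for c in df.columns])
print([type(c).__name__ for c in df2.columns])

# equals() compares the actual label values, not their printed form.
labels_match = df.columns.equals(df2.columns)
print(labels_match)  # False

# Assumption: every label in df2 is a digit string, so int conversion is safe.
df2.columns = df2.columns.astype(int)

df_out = pd.concat([df, df2], ignore_index=True)
print(df_out.shape)
```

After the conversion both frames carry integer labels again, so the concat stacks the rows into two columns instead of spreading them over four.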
You can read more in this question about Pandas Merging 101