I have a DataFrame from which I split off some columns for scaling and PCA. I ran the analysis on the continuous numerical columns, and now I am trying to put the pieces back together: the 2 columns that were categorical, and the scaled data.
Both DataFrames have the same number of rows, and there are no null values in this practice analysis. When I try to concat them, in numerous different ways, I get the correct number of rows but many null values that make no sense. The code is as follows:
Note - categorical_columns_df holds the categorical columns
Note - scaled_df holds the scaled data, which corresponds row for row to the data in categorical_columns_df
dfs_to_concat = [categorical_columns_df, scaled_df]
new_df = pd.concat(dfs_to_concat)
new_df
            Time  Baby_ID  AGE_Under-1  AGE_Under-2  AGE_Under-3  AGE_Under-4  Input (X)  Output (y)        HR    Height    Weight
0     3:00:00 AM      1.0          1.0          0.0          0.0          0.0        NaN         NaN       NaN       NaN       NaN
1     4:00:00 AM      1.0          1.0          0.0          0.0          0.0        NaN         NaN       NaN       NaN       NaN
2     5:00:00 AM      1.0          1.0          0.0          0.0          0.0        NaN         NaN       NaN       NaN       NaN
3     6:00:00 AM      1.0          1.0          0.0          0.0          0.0        NaN         NaN       NaN       NaN       NaN
4     7:00:00 AM      1.0          1.0          0.0          0.0          0.0        NaN         NaN       NaN       NaN       NaN
...
2751         NaN      NaN          NaN          NaN          NaN          NaN   0.604396        0.60  0.532895  0.642857  0.642857
2752         NaN      NaN          NaN          NaN          NaN          NaN   0.615385        0.61  0.559211  0.642857  0.642857
2753         NaN      NaN          NaN          NaN          NaN          NaN   0.626374        0.62  0.578947  0.642857  0.642857
2754         NaN      NaN          NaN          NaN          NaN          NaN   0.615385        0.63  0.572368  0.642857  0.642857
2755         NaN      NaN          NaN          NaN          NaN          NaN   0.604396        0.62  0.559211  0.642857  0.642857
What is going on here? What am I getting wrong in the code, such that half of the columns are null in the top rows and the other half are null in the bottom rows? I have concatenated many DataFrames before and never run into this problem. Any insight is appreciated.
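For reference, here is a minimal sketch with toy data that produces the same block pattern of nulls. The frame contents are made-up stand-ins (only the variable names match my code), and I am assuming, not asserting, that the real data behaves the same way. It contrasts the default pd.concat, which stacks frames vertically, with axis=1 after resetting both indexes, which places them side by side.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question (toy values).
categorical_columns_df = pd.DataFrame(
    {"Time": ["3:00:00 AM", "4:00:00 AM", "5:00:00 AM"], "Baby_ID": [1.0, 1.0, 1.0]}
)
scaled_df = pd.DataFrame(
    {"HR": [0.53, 0.56, 0.58], "Weight": [0.64, 0.64, 0.64]}
)

# Default axis=0 stacks the frames vertically. Columns that exist in only
# one frame are filled with NaN for the rows that came from the other frame.
stacked = pd.concat([categorical_columns_df, scaled_df])

# axis=1 places the frames side by side and matches rows by index label.
# Resetting both indexes first makes the labels line up by position.
side_by_side = pd.concat(
    [
        categorical_columns_df.reset_index(drop=True),
        scaled_df.reset_index(drop=True),
    ],
    axis=1,
)

print(stacked)
print(side_by_side)
```

If the two frames carry different index labels (for example, after filtering or a train/test split), axis=1 without the reset can also leave nulls, because rows are matched by label rather than by position.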