I have a DataFrame with two rows:
import pandas as pd

df = pd.DataFrame({'group': ['c'] * 2,
                   'num_column': range(2),
                   'num_col_2': range(2),
                   'seq_col': [[1, 2, 3, 4, 5]] * 2,
                   'seq_col_2': [[1, 2, 3, 4, 5]] * 2,
                   'grp_count': [2] * 2})
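I then append 8 rows that are NaN in every column except group:
group, size = 'c', 8  # values matching the output below
# note: DataFrame.append was removed in pandas 2.0; pd.concat is the replacement there
df = df.append(pd.DataFrame({'group': group}, index=[0] * size))
With the 8 null rows, it looks like this: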
   group  grp_count  num_col_2  num_column          seq_col        seq_col_2
0      c        2.0        0.0         0.0  [1, 2, 3, 4, 5]  [1, 2, 3, 4, 5]
1      c        2.0        1.0         1.0  [1, 2, 3, 4, 5]  [1, 2, 3, 4, 5]
0      c        NaN        NaN         NaN              NaN              NaN
0      c        NaN        NaN         NaN              NaN              NaN
0      c        NaN        NaN         NaN              NaN              NaN
0      c        NaN        NaN         NaN              NaN              NaN
0      c        NaN        NaN         NaN              NaN              NaN
0      c        NaN        NaN         NaN              NaN              NaN
0      c        NaN        NaN         NaN              NaN              NaN
0      c        NaN        NaN         NaN              NaN              NaN
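For anyone reproducing this on pandas 2.0 or later, where DataFrame.append is no longer available, the same frame can probably be built with pd.concat. This is only a sketch: it assumes group = 'c' and size = 8 (inferred from the table above), and the column order it produces may differ from the alphabetical order shown in the printed output.

```python
import pandas as pd

df = pd.DataFrame({'group': ['c'] * 2,
                   'num_column': range(2),
                   'num_col_2': range(2),
                   'seq_col': [[1, 2, 3, 4, 5]] * 2,
                   'seq_col_2': [[1, 2, 3, 4, 5]] * 2,
                   'grp_count': [2] * 2})

# Assumed values, inferred from the table above.
group, size = 'c', 8

# Rows that carry only the group label; every other column
# becomes NaN once the two frames are concatenated.
padding = pd.DataFrame({'group': group}, index=[0] * size)
df = pd.concat([df, padding])
```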
What I want
Replace NaN values in the sequence columns (seq_col, seq_col_2, seq_col_3, etc.) with a list of my own.
Note:
- In this data there are only 2 sequence columns, but there could be many more.
- Lists already in the columns must not be replaced, ONLY the NaNs.
I could not find a solution that replaces NaN with a user-provided list value, for example one taken from a dictionary.
Pseudo Code:
for each key, value in dict,
    for each column in df
        if column matches key in dict
            # here matches means the 'seq_col_n' key of dict matched the df
            # column named 'seq_col_n'
            replace NaN with value in seq_col_n (which is a list of numbers)
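To make the pseudocode concrete, here is one way it might look in Python. It is a sketch rather than a tested answer: fill_nan_with_lists is a hypothetical helper name, and the small frame at the bottom is a cut-down stand-in for the real data. It rebuilds each matching column cell by cell instead of assigning through .loc, so every NaN gets its own copy of the list and existing lists are left alone.

```python
import pandas as pd

def fill_nan_with_lists(df, fill_values):
    """Return a copy of df where NaN cells in the columns named by
    fill_values are replaced by the corresponding list."""
    out = df.copy()
    for col, fill in fill_values.items():
        if col not in out.columns:
            continue  # key has no matching column; skip it
        is_nan = out[col].isna()
        # Rebuild the column cell by cell: existing lists are kept,
        # each NaN gets its own copy of the fill list.
        out[col] = [list(fill) if missing else cell
                    for cell, missing in zip(out[col], is_nan)]
    return out

# Cut-down stand-in for the frame above: 2 filled rows, 8 NaN rows.
nan = float('nan')
df = pd.DataFrame({'group': ['c'] * 10,
                   'seq_col': [[1, 2, 3, 4, 5]] * 2 + [nan] * 8,
                   'seq_col_2': [[1, 2, 3, 4, 5]] * 2 + [nan] * 8},
                  index=[0, 1] + [0] * 8)
my_dict = {'seq_col': [1, 2, 3], 'seq_col_2': [6, 7, 8]}
result = fill_nan_with_lists(df, my_dict)
```

Keys with no matching column are skipped, so the same dictionary can be reused across frames that have different numbers of sequence columns.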
I tried the code below, where fill_values is my dictionary of replacement lists. It works for the first column I pass, but not for the second, which is weird.
df.loc[df['seq_col'].isnull(),['seq_col']] = df.loc[df['seq_col'].isnull(),'seq_col'].apply(lambda m: fill_values['seq_col'])
The above works, but when I try the same thing again on seq_col_2, it gives weird results.
Expected output, given this input:
my_dict = {'seq_col': [1, 2, 3], 'seq_col_2': [6, 7, 8]}
# after executing code that implements the pseudocode above, it should look like
   group  grp_count  num_col_2  num_column          seq_col        seq_col_2
0      c        2.0        0.0         0.0  [1, 2, 3, 4, 5]  [1, 2, 3, 4, 5]
1      c        2.0        1.0         1.0  [1, 2, 3, 4, 5]  [1, 2, 3, 4, 5]
0      c        NaN        NaN         NaN        [1, 2, 3]        [6, 7, 8]
0      c        NaN        NaN         NaN        [1, 2, 3]        [6, 7, 8]
0      c        NaN        NaN         NaN        [1, 2, 3]        [6, 7, 8]
0      c        NaN        NaN         NaN        [1, 2, 3]        [6, 7, 8]
0      c        NaN        NaN         NaN        [1, 2, 3]        [6, 7, 8]
0      c        NaN        NaN         NaN        [1, 2, 3]        [6, 7, 8]
0      c        NaN        NaN         NaN        [1, 2, 3]        [6, 7, 8]
0      c        NaN        NaN         NaN        [1, 2, 3]        [6, 7, 8]