I have the code below (sorry, I cannot share the dataframe for confidentiality reasons), and maybe I'm missing something here:
new_df = None
new_fn = None
prev_df = None
prev_fn = None
while 1:
    msg = conn.recv()
    if len(msg) > 1:
        df = msg[0]
        file_name = msg[1]
        df['col2'] = ''
        df['col2'] = df['col2'].apply(pd.to_numeric).astype('Int64')
        if new_fn is None:
            new_df = df
            new_fn = file_name
            new_df['col2'] = new_df['col1']
        else:
            prev_df = new_df
            prev_fn = new_fn
            new_df = df
            new_fn = file_name
            new_df = prev_df.merge(new_df, on='main', how='outer', suffixes=('_prev', '_new'))
            new_df = new_df.assign(**{col: new_df[col].fillna(new_df[col.replace("_new", "_prev")])
                                      for col in new_df.columns if "_new" in col})
Then the code reaches the block below, which works on random dataframes I've tested with the same characteristics, but not when combined with the code above:
np.where(new_df['col2_new'].isna(),
         new_df['col2_new'].fillna(new_df['col1_new']), new_df['col2_new'])
For some reason the fillna doesn't work and leaves col2_new with many NA values:
print(new_df.isna().sum())    print(new_df.dtypes)
main           0              main         object
col1_prev    158              col1_prev     Int64
col2_prev    158              col2_prev     Int64
col1_new       0              col1_new      Int64
col2_new     158              col2_new      Int64
dtype: int64                  dtype: object
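For comparison, here is a minimal sketch of the behaviour I expect, on made-up data (the three rows and their values are assumptions standing in for my real dataframe). One thing it makes visible, and which I still need to rule out in my own code, is that `fillna` returns a new Series, so the NA count only drops once the result is assigned back to the column:

```python
import pandas as pd

# Made-up stand-in for the merged frame; the real data is confidential.
new_df = pd.DataFrame({
    "main": ["a", "b", "c"],
    "col1_new": pd.array([1, 2, 3], dtype="Int64"),
    "col2_new": pd.array([10, None, None], dtype="Int64"),
})

# Calling fillna without keeping the result leaves new_df untouched.
_ = new_df["col2_new"].fillna(new_df["col1_new"])
na_before = int(new_df["col2_new"].isna().sum())

# Assigning the result back is what actually updates the column.
new_df["col2_new"] = new_df["col2_new"].fillna(new_df["col1_new"])
na_after = int(new_df["col2_new"].isna().sum())

print(na_before, na_after)
```

The same would apply to the array returned by `np.where`: unless it is stored somewhere, the dataframe itself stays as it was.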
I've also run into some issues with isna/isnull, which seems to be the problem:
df = pd.DataFrame({'col1': str(randint(10, 100)), 'col2': randint(10, 100), 'col3': ""}, index=range(0, 3))
np.where(df['col3'].isna, df['col3'].fillna(df['col1']), df['col3'])
It was giving a correct output until just recently, but now it feels like something has broken:
print(df.count())    print(df.isna().sum())    print(df)
col1    3            col1    0                   col1  col2 col3
col2    3            col2    0                 0   33    38
col3    3            col3    0                 1   33    38
dtype: int64         dtype: int64              2   33    38
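A small check on made-up data that may be relevant here (this is my own guess, not a confirmed diagnosis): as far as I can tell, `isna()` does not count an empty string as missing, and `df['col3'].isna` written without parentheses is a bound method rather than a boolean mask. The fixed values 33 and 38 below are assumptions chosen to mirror the printed output above:

```python
import pandas as pd

# Made-up frame mirroring the printed output; values are fixed instead of random.
df = pd.DataFrame({"col1": ["33"] * 3, "col2": [38] * 3, "col3": [""] * 3})

# Empty strings are ordinary values, so isna() reports no missing entries.
empty_is_na = bool(df["col3"].isna().any())

# Without the call parentheses this is a method object, not a mask of booleans.
method_is_callable = callable(df["col3"].isna)

# Treat empty strings as missing explicitly and take col1 in those rows.
col3 = df["col3"].mask(df["col3"] == "", df["col1"])

print(empty_is_na, method_is_callable, col3.tolist())
```

If that guess is right, the empty-string column would need to be compared against `""` (or converted to real NA values first) before any `fillna` has something to fill.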
Is it just me? Am I doing something wrong? Is it the interpreter?
I appreciate any help. Thanks!