I am trying to concatenate an older DataFrame (`main_df`) with a newer one (`ellicom_df`) and then drop every row that has the same manufacturer but a date different from the date of this update. However, the code drops far more rows than it should. In the example below, the old `main_df` has 6269 rows and the new `ellicom_df` has 7126; the concatenated frame (correctly) has 13395. After the drop, though, the updated `main_df` has only 3472 rows, when it should have 7126. I know the problem is either in the drop or in the date, but I can't figure out where.

print(ellicom_df.shape)
print(main_df.shape)

(7126, 4) (6269, 8)

date_=pd.Timestamp('now').date()
ellicom_df['UPDATED'] = date_
ellicom_df['MANUFACTURER']='ELLICOM'

main_df = pd.concat([main_df, ellicom_df], sort=False)
main_df.shape

(13395, 9)

main_df.drop(main_df.loc[(main_df['MANUFACTURER']=='ELLICOM') &
    (main_df['UPDATED']!=date_)].index, inplace=True)

main_df.shape

(3472, 9)

Here is an example of the DataFrames, starting with main_df: [images not available]

Priniotis
  • It might be an index issue, try reindexing before drop? i.e. `main_df.reset_index()` – FAB Nov 28 '22 at 10:00
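
To illustrate the index suspicion raised in the comment above, here is a minimal sketch on made-up data (the frames `old` and `new`, the string dates, and the helper `stale` are hypothetical stand-ins, not the asker's real data). By default `pd.concat` keeps each frame's original index labels, so the combined frame can contain the same label twice, and `drop` removes every row carrying a given label. Passing `ignore_index=True` is one way to get unique labels; whether this explains the exact row counts in the question is an assumption.

```python
import pandas as pd

# Hypothetical toy frames standing in for main_df and ellicom_df.
old = pd.DataFrame({
    "MANUFACTURER": ["ELLICOM", "ELLICOM", "OTHER"],
    "UPDATED": ["2022-11-01", "2022-11-01", "2022-11-01"],
})
new = pd.DataFrame({
    "MANUFACTURER": ["ELLICOM", "ELLICOM"],
    "UPDATED": ["2022-11-28", "2022-11-28"],
})
today = "2022-11-28"  # stands in for date_


def stale(df):
    """Boolean mask of ELLICOM rows that were not updated today."""
    return (df["MANUFACTURER"] == "ELLICOM") & (df["UPDATED"] != today)


# Default concat: index labels are 0, 1, 2, 0, 1 (labels 0 and 1 repeat).
dup = pd.concat([old, new], sort=False)
# Dropping labels 0 and 1 removes the stale rows AND the fresh rows
# that happen to share those labels.
dropped_dup = dup.drop(dup.loc[stale(dup)].index)

# ignore_index=True relabels the rows 0..4, so each label is unique.
uniq = pd.concat([old, new], sort=False, ignore_index=True)
dropped_uniq = uniq.drop(uniq.loc[stale(uniq)].index)

print(len(dropped_dup))   # 1 (only the OTHER row survives)
print(len(dropped_uniq))  # 3 (OTHER plus the two fresh ELLICOM rows)
```

Note that `reset_index()` returns a new frame rather than modifying the original, so its result would need to be assigned back (and `drop=True` avoids keeping the old labels as a column).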

1 Answer

Why not simply filter this way:

main_df = main_df[(main_df['MANUFACTURER']!='ELLICOM') | (main_df['UPDATED']==date_)]
gtomer
  • change `&&` -> `&` ;) – mozway Nov 28 '22 at 10:02
  • I have multiple manufacturers that I have to update in main_df; 'ELLICOM' is just one of them... I am afraid this way deletes them all. – Priniotis Nov 28 '22 at 10:23
  • There is a quick solution for that as well. Open a new question for that – gtomer Nov 28 '22 at 12:37
  • Since you are inverting the logical expression for dropping rows to get rows to keep, shouldn't it be `|` rather than `&`? Thus, `main_df = main_df[(main_df['MANUFACTURER']!='ELLICOM') | (main_df['UPDATED']==date_)]` – DarrylG Nov 28 '22 at 13:06
  • I want to drop the rows where BOTH `main_df['MANUFACTURER']=='ELLICOM'` and `main_df['UPDATED']!=date_` are true... I have been trying to solve this for a week... almost ready to give up – Priniotis Nov 30 '22 at 07:42
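
As a rough check of the `&` versus `|` point discussed in the comments, the sketch below uses invented data (the frame `df`, the manufacturer name "ACME", and the string dates are assumptions for illustration). It suggests that the "keep" mask from the answer is the logical negation of the "drop" condition, that rows from other manufacturers are kept, and that a boolean mask filters by row position rather than by index label, so duplicated labels do not matter here.

```python
import pandas as pd

# Hypothetical data; the index labels are duplicated on purpose.
df = pd.DataFrame(
    {
        "MANUFACTURER": ["ELLICOM", "OTHER", "ELLICOM", "ACME"],
        "UPDATED": ["2022-11-01", "2022-11-01", "2022-11-28", "2022-10-15"],
    },
    index=[0, 1, 0, 1],
)
today = "2022-11-28"  # stands in for date_

# Rows to drop: ELLICOM AND not updated today.
to_drop = (df["MANUFACTURER"] == "ELLICOM") & (df["UPDATED"] != today)
# Rows to keep: not ELLICOM OR updated today (De Morgan's law).
to_keep = (df["MANUFACTURER"] != "ELLICOM") | (df["UPDATED"] == today)

# The two masks are exact complements of each other.
assert (to_keep == ~to_drop).all()

kept = df[to_keep]
print(kept["MANUFACTURER"].tolist())  # ['OTHER', 'ELLICOM', 'ACME']
```

Only the stale ELLICOM row is removed; the OTHER and ACME rows survive even though their dates differ from `today`.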