So, basically I have two dataframes one is df1
and df2
so each size is [9000000 rows X 11 columns] and [2700000 rows X 18 columns].
Following df1
looks like this,
| Index | Frame | Chunk_Ids |...|P_filter|
|:------|:-----:|:---------:|:-:|:------:|
| 1 | 1 | 20001 |...| 0.142|
| 2 | 1 | 20001 |...| 0.146|
| 3 | 1 | 20001 |...| 0.147|
| . | . | . |...| .|
| 900000| 260000 | 200020 |0.245| 0.142|
Following `df2` looks like this,
| Index | Frame | Chunk_Ids |...| P_filter|
|:------|:------:|:---------:|:-:|:-------:|
| 3 | 1 | 19999 |...| 0.142|
| 6 | 1 | 20001 |...| 0.146|
| 7 | 1 | 20001 |...| 0.147|
| . | . | . |...| .|
| 270000| 260000 | 200019 |...| 0.142|
I have two questions, how to merge two dataframes with below condition,
- Merge if both df contain common columns chunk IDs without missing any data
- Compare df1 index with df2 index, and if the particular index is missing then replace whole respective df2 row with '-1'.
I tried and burnt my head out from last two days with many methods but I can't satisify both conditions. Any leads will be appreciated.