0

I try to do a "excel v lookup" to merge 2 df based on a value. If the 1,st value is in the second then merge(see ex below). I tried many options here but no one works. I try to put the dates on corresponding numbers from column WO_ID and keep all numbers from this column.

I know how to transform string in dates but I let * and A maybe you can write complete code.

So:

Data Frame 1: cms

WO_ID          Status
14163-2018  1
14142-2018  1
2092-2019   2
2093-2019   2
11922-2019  1



Data Frame 2: df

WO                    Start     Finish
2092-2019; 2093-2019      12-07-19* 18-07-19
11922-2019            02-07-2019A   28-07-19*
14142-2018; 14163-2018    06-07-19  28-07-19*

Beside many other codes tried, I think this is more close to result...but obviously does not work.

I try to create a 3rd db that contain merged cms+db (let's call it "all")

This is the code:

all = pd.merge(cms,df,left_on='WO_ID',right_on='WO',how='left',indicator=True)

This is the result that obviously is not complete:

WO_ID         Statu     WO      Start         Finish        _merge
14163-2018  1   NaN     NaT           NaT          left_only
14142-2018  1   NaN     NaT           NaT          left_only
2092-2019   2   NaN     NaT       NaT          left_only
2093-2019   2   NaN     NaT       NaT          left_only
11922-2019  1   11922-2019  02-07-2019A   28-07-19* both

I try to obtain output like this:

WO_ID        Status  WO                      Start    Finish    _merge
14163-2018  1    14142-2018; 14163-2018  06-07-19    28-07-19* whatever..
14142-2018  1    14142-2018; 14163-2018  06-07-19    28-07-19* whatever..
2092-2019   2    2092-2019; 2093-2019    12-07-19*   18-07-19  whatever..
2093-2019   2    2092-2019; 2093-2019    12-07-19*   18-07-19  whatever..
11922-2019  1    11922-2019              02-07-2019A 28-07-19* both

But I don't know how.

Nimantha
  • 6,405
  • 6
  • 28
  • 69
Ionut
  • 1
  • 1

2 Answers2

0

Data Frame 1: cms

enter image description here

Data Frame 2: df

enter image description here

Formula and Result enter image description here

Expected Result

enter image description here

Sasikumar Murugesan
  • 4,412
  • 10
  • 51
  • 74
Ionut
  • 1
  • 1
  • this is excel based solution. let me know if the python solution also worked for you. – MNA Jul 11 '19 at 06:01
0

The reason your merge is not working is because your key is not the same in both the dataframes. One way you can do is that create a new key for the second dataframe df. The way you can do is using the following solution https://stackoverflow.com/a/39946744/8329101

df['key'] = df['WO']
df_exploded = tidy_split(df, 'key', sep=';')

Then apply the join operation

all = pd.merge(cms,df_exploded ,left_on='WO_ID',right_on='key',how='left')
MNA
  • 183
  • 2
  • 8
  • Thank you very much for support. We are close but I still get an error: File "/home/inecweb/DB.py", line 37, in df_exploded = tidy_split(df, 'key', sep=';',keep=False) NameError: name 'tidy_split' is not defined – Ionut Jul 03 '19 at 13:01
  • You have to copy paste the definition of tidy_split function from the link that I have provided above. Once you define that, it should work. – MNA Jul 04 '19 at 05:34