1

I have written a small python program to deal with orders I receive. But due to change in the reference value I used to merge my df on is changed.

Example:

Data

The problem is, I used to merge Old PO numbers to new PO numbers so that I know a customer is returning. But recently I started receiving partial orders which are 3c456 instead of 3c456/4c345. It means my customer is renewing only one product now. My pd.merge code doesn't find the partial order anymore.

My code was

df = pd.merge(old_PO_file, New_PO_file, on =PO-number)

since the new PO is partial and could be the value before or after the slash. I would like to duplicate all the rows in PO-number column if there is a Slash(/) in that cell.

Please let me know if it is possible.

Thanks

Maneet Giri
  • 185
  • 3
  • 18
  • 1
    Please share the code in question – Patrick Haugh Jan 05 '17 at 13:56
  • Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Jan 05 '17 at 14:00
  • @PatrickHaugh Just added the code. It used to be a simple pd.merge code. – Maneet Giri Jan 05 '17 at 14:07
  • So you want to make the `3c456/4c345` row into two rows a `3c456` row and a `4c345` row? – Patrick Haugh Jan 05 '17 at 14:10
  • @PatrickHaugh Yes, you are right – Maneet Giri Jan 05 '17 at 14:13

1 Answers1

2

You can use str.split with stack for creating Series for join to original DataFrame:

print (df)
   Sold-to    PO-Number Sold-to-party
0     1234       6d1234   abc company
1     2345  3c456/4c345   edf company
2     5679        9f639   mno company

df1 = df.drop('PO-Number', axis=1) \
        .join(df['PO-Number'] \
        .str \
        .split('/', expand=True) \
        .stack() \
        .reset_index(level=1, drop=True).rename('PO-Number')) \
        .reset_index(drop=True)

print (df1)
   Sold-to Sold-to-party PO-Number
0     1234   abc company    6d1234
1     2345   edf company     3c456
2     2345   edf company     4c345
3     5679   mno company     9f639
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252