0

I have a df which have a column called Description and it has values like:

ID   Description
1    (a) this is good (b) bad (c) average
2    Ok
3    i am rahul works on (a) stack overflow (b) stack exchange

Expected DF:

 ID      Description
1        (a) this is good
1        (b) bad
1        (c) average
2        Ok
3        i am rahul works on (a) stack overflow
3        (b) stack exchange

I have seen answers like this and this which explodes column basis on separators but I need to explode based on bullet points.

jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
Rahul Agarwal
  • 4,034
  • 7
  • 27
  • 51

1 Answers1

5

Here is one way, adapted from one of your links:

>>> df['Description'].str.split(r'.(?=\(.\))',expand = True).stack() 
ID
1   0       (a) this is good
    1                (b) bad
    2            (c) average
2   0                     Ok
3   0    i am rahul works on
    1     (a) stack overflow
    2     (b) stack exchange

Just using a lookahead regular expression to split on. The split on the 3d id is a bit different since you are not really splitting there.

jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
kabanus
  • 24,623
  • 6
  • 41
  • 74
  • @kabanus @jezrael: This works well!! Thanks!! Just one more thing if I there is else like `(i)` or `(ii)` but I dont want to split on these things!! I only want to split when there is a letter in between brackets and no numbers!!. Can this solution be tweaked – Rahul Agarwal Apr 15 '19 at 13:27
  • 1
    @RahulAgarwal Use `.(?=\([a-zA-Z]\))` - and lookup Python `re`. – kabanus Apr 15 '19 at 13:28