Explode a row based on bullet points

Question

I have a df which have a column called Description and it has values like:

ID   Description
1    (a) this is good (b) bad (c) average
2    Ok
3    i am rahul works on (a) stack overflow (b) stack exchange

Expected DF:

 ID      Description
1        (a) this is good
1        (b) bad
1        (c) average
2        Ok
3        i am rahul works on (a) stack overflow
3        (b) stack exchange

I have seen answers like this and this which explodes column basis on separators but I need to explode based on bullet points.

`regex` might be helpful here – Sociopath Apr 15 '19 at 13:06 — Sociopath, Apr 15 '19 at 13:06

score 5 · Accepted Answer · edited Apr 15 '19 at 13:23

5

Here is one way, adapted from one of your links:

>>> df['Description'].str.split(r'.(?=\(.\))',expand = True).stack() 
ID
1   0       (a) this is good
    1                (b) bad
    2            (c) average
2   0                     Ok
3   0    i am rahul works on
    1     (a) stack overflow
    2     (b) stack exchange

Just using a lookahead regular expression to split on. The split on the 3d id is a bit different since you are not really splitting there.

edited Apr 15 '19 at 13:23

jezrael

822,522
95
1,334
1,252

answered Apr 15 '19 at 13:14

kabanus

24,623
6
41
74

@kabanus @jezrael: This works well!! Thanks!! Just one more thing if I there is else like `(i)` or `(ii)` but I dont want to split on these things!! I only want to split when there is a letter in between brackets and no numbers!!. Can this solution be tweaked – Rahul Agarwal Apr 15 '19 at 13:27
1

@RahulAgarwal Use `.(?=\([a-zA-Z]\))` - and lookup Python `re`. – kabanus Apr 15 '19 at 13:28

Explode a row based on bullet points

1 Answers1