I have a DataFrame as below with all three columns having strings:
Column A | Column B | Column C |
---|---|---|
Apple | red, yellow, blue | Texas, California |
Banana | yellow, orange | Indiana, New Zealand |
Watermelon | grey |
I would like to split each row into multiple rows based on a delimiter value. The final output would look like below.
Column A | Column B | Column C |
---|---|---|
Apple | red | Texas |
Apple | yellow | California |
Apple | blue | |
Banana | yellow | Indiana |
Banana | orange | New Zealand |
Watermelon | grey |
Note:
- Delimiters would only be present in Column B and Column C.
- The splits need not be equal. For instance, In row 1, the max splits across Column B and Column C are 3. But since column C has only 2 splits, the third row would result in column C as null/blank.
I tried following this stack overflow post, but couldn't apply it in my case here since each column would have different values based on splits across multiple columns, and since I am dealing with string splits.