I have an Excel file that I've already loaded into a pandas DataFrame and done some analysis on. The issue I'm facing is how to separate multiple values that are given in the same row. They're delimited like this: a) name1 ; b) name2

As a beginner in pandas, I can't work out the logic to pull apart the multiple values given in the column.

[Image: screenshot of the dataset]

This is the dataset I'm working on, and I'm unsure how to separate the multiple values that appear in the same row.

1 Answer

You can use .str.split() to split the column in two, then strip the leading (a) and (b) labels from each part:

>>> import pandas as pd
>>> df = pd.DataFrame({"Chronic medical conditions": ["(a) BP; (b) Diabetes", "(a) Diabetes; (b) high BP"]})
>>> df
  Chronic medical conditions
0       (a) BP; (b) Diabetes
1  (a) Diabetes; (b) high BP

>>> df = df["Chronic medical conditions"].str.split(';', expand=True)
>>> df.columns = ["a", "b"]  # rename columns as necessary
>>> df
              a              b
0        (a) BP   (b) Diabetes
1  (a) Diabetes    (b) high BP

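>>> # lstrip("(a) ") strips a set of characters rather than a prefix, so a value
>>> # such as "(a) asthma" would lose its leading "a"; a regex removes only the label
>>> df["a"] = df["a"].str.replace(r"^\s*\(a\)\s*", "", regex=True)
>>> df["b"] = df["b"].str.replace(r"^\s*\(b\)\s*", "", regex=True)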
>>> df
           a         b
0         BP  Diabetes
1   Diabetes   high BP
ChrisOram
  • Thanks for the help, but one more query: a few rows have more than 2 values; some have 4 and some have up to 5. Can I make this process dynamic? As written it is limited to 2 only, right? – Akshat Kulshreshtha Aug 17 '22 at 10:55
  • I see. To do this in a reasonable and efficient way, I would assume some regex with `split` or `extract` would be required. – ChrisOram Aug 17 '22 at 11:00
  • Can you help me with that as well? I mean I'm quite new to all this stuff – Akshat Kulshreshtha Aug 17 '22 at 11:14
  • @AkshatKulshreshtha I would suggest creating another question which is more specific to this. Something like, "how to extract a variable number of string slices from columns in Pandas" and tag regex also - there are many pandas + regex wizards on StackOverflow. – ChrisOram Aug 23 '22 at 09:45
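
For the variable-length case raised in the comments, here is a minimal sketch rather than a definitive answer. It assumes every value is separated by ";" and prefixed with a single lowercase letter in parentheses, as in the examples above. The sample rows and the condition_N column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: rows carry two, four, or one labelled value.
col = "Chronic medical conditions"
df = pd.DataFrame({
    col: [
        "(a) BP; (b) Diabetes",
        "(a) Diabetes; (b) high BP; (c) Asthma; (d) Arthritis",
        "(a) Thyroid",
    ]
})

# Matches a leading label such as "(a) " or " (b) ", including surrounding spaces.
label = r"^\s*\([a-z]\)\s*"

# Wide format: expand=True creates as many columns as the longest row needs
# and pads shorter rows with missing values.
wide = df[col].str.split(";", expand=True)
wide = wide.apply(lambda s: s.str.replace(label, "", regex=True).str.strip())
wide.columns = [f"condition_{i + 1}" for i in range(wide.shape[1])]

# Long format: one value per row; the original row index repeats for each value.
long = (
    df[col]
    .str.split(";")
    .explode()
    .str.replace(label, "", regex=True)
    .str.strip()
)

print(wide)
print(long)
```

The wide form keeps one row per original record, which suits side-by-side comparison. The long form is usually easier to count or group on, for example with long.value_counts().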