MRE:
import pandas as pd

df = pd.DataFrame({"title": ["Canada,Chris,Data Scientist", "Korea,Kim,Analyst", "HK,Lai,Scientist"],
                   "R": [0.7, 0.2, 0.3]})
My goal is to split the title column into separate country, name, and job columns in a vectorized way.
My current method is:
df["country"] = df["title"].apply(lambda x:x.split(",")[0])
df["name"] = df["title"].apply(lambda x:x.split(",")[1])
df["job"] = df["title"].apply(lambda x:x.split(",")[2])
This successfully outputs:
                         title    R  country   name             job
0  Canada,Chris,Data Scientist  0.7   Canada  Chris  Data Scientist
1            Korea,Kim,Analyst  0.2    Korea    Kim         Analyst
2             HK,Lai,Scientist  0.3       HK    Lai       Scientist
However, this operation is not vectorized.
Usually, a vectorized string operation would look like:
df["title"].str.split(",")
but this returns a Series of lists, and I don't know how to select one element from each list in the Series.
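For illustration, here is a minimal sketch of two approaches that might do what I'm after, assuming every title contains exactly three comma-separated fields (that assumption is mine, based on the sample data above). The first uses the expand=True option of str.split to get one column per field. The second uses the .str accessor again to index into each list. I have not benchmarked either against the apply version.

```python
import pandas as pd

df = pd.DataFrame({
    "title": ["Canada,Chris,Data Scientist", "Korea,Kim,Analyst", "HK,Lai,Scientist"],
    "R": [0.7, 0.2, 0.3],
})

# Option 1: expand=True returns a DataFrame with one column per field,
# which can be assigned to several new columns at once.
# Assumes each title has exactly three fields.
df[["country", "name", "job"]] = df["title"].str.split(",", expand=True)

# Option 2: .str[i] selects the i-th element of each list in the Series.
first_field = df["title"].str.split(",").str[0]
```

With Option 2, a row that has fewer fields than the requested index yields NaN rather than raising an error, which may or may not be the behavior wanted.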