I have a DF as shown below:
DF =
id  token       argument1             argument2
1   Tza         Tuvia Tza             Moscow
2   perugia     umbria                perugia
3   associated  the associated press  Nelson
I now want to compare the values of the argumentX columns with token and set the new column ARG according to which column contains the token.
DF =
id  token       argument1             argument2  ARG
1   Tza         Tuvia Tza             Moscow     ARG1
2   perugia     umbria                perugia    ARG2
3   associated  the associated press  Nelson     ARG1
Here is what I tried:
import numpy as np

conditions = [
    DF["token"] == DF["argument1"],
    DF["token"] == DF["argument2"],
]
choices = ["ARG1", "ARG2"]
DF["ARG"] = np.select(conditions, choices, default=np.nan)
This only compares the entire strings and matches when they are identical. Constructions such as .isin, .contains, or a helper column such as DF["ARG_cat"] = DF.apply(lambda row: row['token'] in row['argument2'], axis=1) did not work. Any ideas?
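
To make the goal concrete, here is a minimal, self-contained sketch of the row-wise substring check I am after. It rebuilds the example frame from above and assumes that every cell is a plain string with no missing values, and that a simple substring match (not a whole-word match) is what is wanted. The helper name pick_arg is hypothetical and only used for illustration.

```python
import pandas as pd

# Example frame from the question
DF = pd.DataFrame({
    "id": [1, 2, 3],
    "token": ["Tza", "perugia", "associated"],
    "argument1": ["Tuvia Tza", "umbria", "the associated press"],
    "argument2": ["Moscow", "perugia", "Nelson"],
})

def pick_arg(row):
    # Hypothetical helper: return the label of the first argument
    # column whose text contains the token as a substring.
    if row["token"] in row["argument1"]:
        return "ARG1"
    if row["token"] in row["argument2"]:
        return "ARG2"
    return None  # token found in neither column

DF["ARG"] = DF.apply(pick_arg, axis=1)
print(DF)
```

If the token appears in both columns, this sketch picks ARG1 because it is checked first; whether that tie-break is right depends on the data.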