I am looking for an idiomatic, Pandas-style solution to the following problem. I have a straightforward implementation that loops over rows and checks conditions in both columns. I'm working on an NLP problem and need to locate tokens in sentences. I have two dataframes: the first contains the start_0 and end_0 positions of tokens (drugs), and the second contains the start_1 and end_1 positions of sentences. For example:
Position of tokens:
df_0 =
start_0 end_0 token
0 20 27 aspirin
1 50 59 trazodone
2 81 88 placebo
3 121 127 haldol
Position of sentences:
df_1=
start_1 end_1
0 0 17
1 17 29
2 29 46
3 46 64
4 64 76
5 76 81
6 81 97
7 97 227
I need to create a new column in df_1 and put the token in the corresponding row, namely:
df_1=
start_1 end_1 token
0 0 17 NaN
1 17 29 aspirin
2 29 46 NaN
3 46 64 trazodone
4 64 76 NaN
5 76 81 NaN
6 81 97 placebo
7 97 227 haldol
Simply put, I want to match the two dataframes wherever the position of a token falls within a sentence. There must be a simple solution using Pandas functionality, rather than looping over rows and checking both boundaries.
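One direction that might avoid the explicit loop is to treat the sentences as intervals and look up each token's start position in them. The following is only a rough sketch under a few assumptions that are not stated above: sentence spans are half-open ([start_1, end_1)) and do not overlap, each sentence contains at most one token, and a token only counts if its end_0 also falls within the sentence. The variable names sentences, pos, inside and token are purely illustrative.

```python
import numpy as np
import pandas as pd

# Example data from the question.
df_0 = pd.DataFrame({
    "start_0": [20, 50, 81, 121],
    "end_0": [27, 59, 88, 127],
    "token": ["aspirin", "trazodone", "placebo", "haldol"],
})
df_1 = pd.DataFrame({
    "start_1": [0, 17, 29, 46, 64, 76, 81, 97],
    "end_1": [17, 29, 46, 64, 76, 81, 97, 227],
})

# Assumption: sentence spans are half-open and non-overlapping.
sentences = pd.IntervalIndex.from_arrays(
    df_1["start_1"], df_1["end_1"], closed="left"
)

# Row position in df_1 of the sentence containing each token start (-1 if none).
pos = sentences.get_indexer(df_0["start_0"])

# Keep only tokens whose end also lies inside the matched sentence.
inside = pos >= 0
inside[inside] = (
    df_0["end_0"].to_numpy()[inside] <= df_1["end_1"].to_numpy()[pos[inside]]
)

# Object column so that strings and NaN can coexist.
# Assumption: at most one token per sentence (otherwise the last one wins).
token = pd.Series(np.nan, index=df_1.index, dtype=object)
token.iloc[pos[inside]] = df_0["token"].to_numpy()[inside]
df_1["token"] = token

print(df_1)
```

If a sentence could hold several tokens, this would need a different output shape (for example, one row per sentence/token pair), so I would still welcome a cleaner approach.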