I have a DataFrame that defines a set of ranges (the values are stored as strings):
import pandas as pd
df = pd.DataFrame([["1", "10"], ["11", "67"], ["90", "115"]], columns=['start', 'end'])
And a second DataFrame with a column of reference values, also stored as strings:
df2 = pd.DataFrame([["1"], ["3"], ["31"], ["70"], ["71"], ["90"], ["99"], ["100"], ["200"]], columns=['reference'])
For each reference value, I am trying to look up the range it falls into (bounds inclusive), so that the result looks like this:
df3 = pd.DataFrame([["1", "1", "10"], ["3", "1", "10"], ["31", "11", "67"], ["70", "no range", "no range"],
                    ["71", "no range", "no range"], ["90", "90", "115"], ["99", "90", "115"],
                    ["100", "90", "115"], ["200", "no range", "no range"]], columns=['reference', "start", "end"])
I did something similar earlier using only NumPy. That solution looked like this:
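import numpy as np

# `extension` (values to check) and `ranges` (2-D array with the start in
# column 1 and the end in column 2) are NumPy arrays defined elsewhere.
result_good = []
result_bad = []
for d in extension:
    # True for every row of `ranges` whose [start, end] interval contains d
    categories = np.logical_and(d >= ranges[:, 1], d <= ranges[:, 2])
    if categories.any():
        # keep only the matching row(s), not the whole `ranges` array
        result_good.append(ranges[categories])
    else:
        result_bad.append(d)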
This basically worked, but I would like to do the same thing with pandas. So far I have only managed to compare two DataFrames of the same length, or to brute-force it with a loop over the rows. There must be a more elegant way to do this. Thank you for your help.
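For reference, here is a minimal sketch of one vectorized approach that might fit. It assumes the ranges never overlap, that both bounds are inclusive, and that every string converts cleanly to an integer. The names `intervals`, `pos`, `found` and `result` are just illustrative. It builds a `pd.IntervalIndex` from the ranges and uses `get_indexer` to find the position of the matching range for each reference value (`-1` meaning no match):

```python
import pandas as pd

df = pd.DataFrame([["1", "10"], ["11", "67"], ["90", "115"]], columns=["start", "end"])
df2 = pd.DataFrame([["1"], ["3"], ["31"], ["70"], ["71"], ["90"], ["99"], ["100"], ["200"]],
                   columns=["reference"])

# Assumption: ranges do not overlap and both bounds are inclusive.
intervals = pd.IntervalIndex.from_arrays(
    df["start"].astype(int).to_numpy(),
    df["end"].astype(int).to_numpy(),
    closed="both",
)

# Position of the containing range for each reference value, -1 if none.
pos = intervals.get_indexer(df2["reference"].astype(int).to_numpy())
found = pos >= 0

result = df2.copy()
for col in ["start", "end"]:
    # Default label first, then overwrite the rows that have a match
    # with the original (string) bounds of the matching range.
    result[col] = "no range"
    result.loc[found, col] = df[col].to_numpy()[pos[found]]

print(result)
```

If the ranges can overlap, `get_indexer` is not suitable as far as I understand, and a broadcasting comparison or a merge-based approach would be needed instead.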