I have a data frame DF1 that contains a start index and end index. I want to use these indices to create multiple slices of another data frame DF2.
I don't want to iterate through the whole dataset, one way I thought was to create an extra column on DF2 that will tell me where each slice starts and ends with an integer that changes on every range. This integer should also be included in DF1. But I am not sure how I would go around doing this or if there is a way to get multiple subsets of the data frame as a list or something.
Some things to consider:
- DF1 is always smaller than DF2
- I want to try do this without iterrows
- I want this to work on variable sizes of DF1 and DF2
Something like this:
In [1]: DF1 = pd.DataFrame([[1, 2], [5, 8], [10, 12]], columns=['StartIndex','EndIndex'])
In [2]: DF2 = pd.DataFrame([1,2,3,4,5,10,11,12,13,14,15,16,17,18,19,20], columns=['SomeValue'])
#The result should be either multiple data frames or something like this:
#DF2
SomeValue Slice
0 1 NA
1 2 1
2 3 1
3 4 NA
4 5 NA
5 10 2
6 11 2
7 12 2
8 13 2
9 14 NA
10 15 3
11 16 3
12 17 3
13 18 NA
14 19 NA
15 20 NA
#DF1
StartIndex EndIndex Slice
0 1 2 1
1 5 8 2
2 10 11 3