I am working with a panda dataframe where I have lists as individual cell elements (for few columns). I want to check a condition for every element in a list in one column and select the corresponding list element from other column. I know this can be easily done using zip command by something like:
p = 5 ; q = 6;
DF['Column3'] = [[b for a, b in zip(x, y) if a > p and a <q ] for x, y in zip(DF['Column1'], DF['Column2'])]
However I am not sure how to work with percentiles here i.e. instead of a fixed p and q I want to use some percentile value of the list (say 50 (equivalent to p) percentile to 90 percentile (equivalent to q)).
So for every cell (each cell consisting a list) in a column it should calculate the precentile values and check for corresponding list elements from other list (from corresponding cell) in the other column.
To explain the problem with example (let's say DF):
A | B | Column1 | Column2 | |
---|---|---|---|---|
0 | 3.4 | 5.7 | [2.1, 2.9, 5.2, 6.8] | [2.5,3.4,1.2,5.1] |
1 | 4 | 1.7 | [1.1, 2.5, 5.6, 11.5, 15.6, 21.5] | [12.15,1.58,5.4,1.2,34.2,67.2] |
50-Percentile DF['Column1'][0] is 4.05 and 90 percentile is 6.32. So, in Column1 third value satisfies this condition. Corresponding to this value in Column2 is 1.2. Hence Column3 should have an output list with 1.2. Similar procedure for next row too (p = 8.55, q =18.55) :
A | B | Column1 | Column2 | Column3 | |
---|---|---|---|---|---|
0 | 3.4 | 5.7 | [2.1, 2.9, 5.2, 6.8] | [2.5,3.4,1.2,5.1] | [1.2] |
1 | 4 | 1.7 | [1.1, 2.5, 5.6, 11.5, 15.6, 21.5] | [12.15,1.58,5.4,1.2,34.2,67.2] | [1.2,34.2] |