I'd like to express the following on a pandas data frame, but I don't know how to other than slow manual iteration over all cells.
For context: I have a data frame with two categories of columns, we'll call them the read_columns and the non_read_columns. Given a column name I have a function that can return true or false to tell you which category the column belongs to.
Given a specific read column A:
For each row:
1. Inspect the read column A to get the value X
2. Find the read column with the smallest value Y that is greater than X.
If no read column has a value greater than X, then substitute the largest value
found in all of the *non*-read columns, call it Z, and skip to step 4.
3. Find the non-read column with the greatest value between X and Y and call its value Z.
4. Compute Z - X
At the end I hope to have a series of the Z - X values with the same index as the original data frame. Note that the sort order of column values is not consistent across rows.
What's the best way to do this?