1

I have a python pandas Data Frame with 2 columns: X, which is an angle (181 evenly-spaced values between 0 and 360), and R1, which is an electrical resistance measured for a given angle. I need to add a column "R2", with the result of the following operation:

R2(X) = R1(X) - R1(180 - X).
Negative values of (180 - X) should map to (540 - X), so R1(180 - 182) == R1(358).

In practice, it means the "R2" column should be in the form (using row numbers):

R1[0] - R1[90]
R1[1] - R1[89]
R1[2] - R1[88]
...
R[90] - R[180]
R[91] - R[179]
R[92] - R[178]
...
R[180] - R[90]

My first guess was to to loop over rows' numbers with condition on which rows to take into account. After reading this question and also that one I realized this is not a preferred solution. df.apply() and df.shift() also doesn't seem to be a choice. Since none of the suggestions mentioned in the posts above seem to be straightforward in my case, I'd like to ask: is it possible to avoid "for" loop and what would be the best solution for this particular problem? Is it possible to select values from "R1" column based on the corresponding value from "X" column?

kuba_pol
  • 13
  • 5

2 Answers2

0

You can separate the column from the middle point; for each half, substract the reversed of it from itself, and at the end concatenate:

middle = int(np.ceil(len(df) / 2))

first_half  = df["R1"].iloc[:middle].to_numpy()
second_half = df["R1"].iloc[middle-1:].to_numpy()

df["R2"] = np.concatenate([first_half - first_half[::-1],
                           (second_half - second_half[::-1])[1:]])

Reason for using NumPy is because in pandas, arithmetic operations look at index, so subtracting the "reversed" Series from itself would be 0 due to index alignment (you can try df["R1"].iloc[:middle] - df["R1"].iloc[:middle][::-1] and see all zeros).

Mustafa Aydın
  • 17,645
  • 4
  • 15
  • 38
  • Thank you, it did the job. The only problem I noticed was with `first_half = df["R1"].iloc[:middle].to_numpy()`, which later gives `R1[0] - R1[90]` instead of `R1[0] - R1[91]`. I used `first_half = df["R1"].iloc[:middle + 1].to_numpy()`. It doubles a cell in `first_half` and `second_half` which needs to be skipped while concatenating. – kuba_pol Jan 24 '23 at 10:33
  • thanks for the correction. then it should be `middle+1:` in the `second_half`'s computation, right? – Mustafa Aydın Jan 24 '23 at 10:45
  • 1
    Sorry, I messed up with indices because I remove one row from my original data before performing subtraction. `first_half = df["R1"].iloc[:middle]` is correct, one needs to take `second_half = df["R1"].iloc[middle - 1:]`. After resetting the indices (so they go from 0 to 180), the row with index `[90]` should be included in _both_ halfs to get correct subtraction. It generates additional row with `R1[90] - R1[0]` which is (for this particular case) the same as `R1[90] - R1[180]`. When concatenating, you can skip the last element of `first_half` or the first element of `second_half'. – kuba_pol Jan 24 '23 at 15:10
-1
import pandas as pd

# create example dataframe
df = pd.DataFrame({'A': [1, 2, 3, 4, 5],'B' : [2, 5, 8, 9, 7]})

# subtract value in current row from value in next row
df['A_minus_next'] = df['A'] - df['B'].shift(0)

print(df)

You can use the .shift() method

  • 1
    Thank you @shubham koli, bu it actually does not solve the problem. First, I need to subtract the value in row [91] from value in row [0], row [90] from row[1], row[89] from row [2] and so on. Second, .shift() leaves NaNs in some column, depending on the argument you pass to it. I also need to use values from the same column, like `df["B"] - df["B"].shift(0)` from you example. Let me know if the question was not clear. – kuba_pol Jan 23 '23 at 17:08
  • 1
    shift(0) does nothing really – Mustafa Aydın Jan 24 '23 at 15:14