I have a data frame with 4 columns. The first column is a counter whose values are in hexadecimal.

Data

counter       frequency     resistance      phase
0          15000.000000   698.617126    -0.745298
1          16000.000000   647.001708    -0.269421
2          17000.000000   649.572265    -0.097540
3          18000.000000   665.282775     0.008724
4          19000.000000   690.836975    -0.011101
5          20000.000000   698.051025    -0.093241
6          21000.000000   737.854003    -0.182556
7          22000.000000   648.586792    -0.125149
8          23000.000000   643.014160    -0.172503
9          24000.000000   634.954223    -0.126519
a          25000.000000   631.901733    -0.122870
b          26000.000000   629.401123    -0.123728
c          27000.000000   629.442016    -0.156490

Expected output

| counter | sampling frequency | time    |
| --------| ------------------ |---------|
| 0       |  -                 |t0=0     |
| 1       |  1                 |t1=t0+sf |
| 2       |  1                 |t2=t1+sf |
| 3       |  1                 |t3=t2+sf |

The time column is a new column added to the original data frame. I want to plot time on the x-axis and frequency, resistance, and phase on the y-axis.

1 Answer

Because the value in each row depends on the value in the previous row, you may have to use a for loop for this problem.

For a constant sampling frequency, you can calculate the time column in advance, with no need to operate on the dataframe row by row:

sampling_freq = 1

df['time'] = [sampling_freq * i for i in range(len(df))]
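The same constant-step time column can also be built without a Python-level loop. The sketch below is only an illustration: the small dataframe is a made-up stand-in for the question's data, and it assumes the hexadecimal counter is stored as strings and that `sampling_freq` is the constant time step. It shows two options: time from the row position, and time from the counter itself after converting it from hexadecimal with `int(..., 16)`.

```python
import numpy as np
import pandas as pd

sampling_freq = 1  # assumed constant time step between consecutive samples

# Hypothetical stand-in for the question's data: 13 rows, hex counter as strings.
df = pd.DataFrame({
    "counter": [format(i, "x") for i in range(13)],  # '0', ..., '9', 'a', 'b', 'c'
    "frequency": [15000.0 + 1000.0 * i for i in range(13)],
})

# Option 1: time from the row position (0, 1, 2, ...) times the step.
df["time"] = np.arange(len(df)) * sampling_freq

# Option 2: time from the hex counter, parsed as a base-16 integer.
counter_int = df["counter"].map(lambda s: int(str(s), 16))
df["time_from_counter"] = counter_int * sampling_freq
```

Both columns agree as long as the counter starts at 0 and no rows are missing. If rows can be missing, the counter-based version keeps the gaps in the time axis, which may or may not be what you want.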

If you need to operate on the dataframe (say, because the sampling frequency may change at some point) and want to address each cell by row number and column name, you can use the following. The syntax would be a lot simpler using integer positions for both row and column, but I prefer to refer to 'time' instead of 2.

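import numpy as np

df['time'] = np.zeros(len(df))  # t0 = 0; the remaining rows are filled in below

for i in range(1, len(df)):
    # time[i] = time[i-1] + sampling frequency[i]
    df.iloc[i, df.columns.get_loc('time')] = df.iloc[i-1, df.columns.get_loc('time')] + df.iloc[i, df.columns.get_loc('sampling frequency')]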

Alternatively, reset the index so you can iterate over consecutive integer labels with `loc`:

df['time'] = np.zeros(len(df))
df = df.reset_index()

for i in range(1, len(df)):
    df.loc[i, 'time'] = df.loc[i-1, 'time'] + df.loc[i, 'sampling frequency']

df = df.set_index('counter')
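The recurrence in these loops, time[i] = time[i-1] + sampling frequency[i] with time[0] = 0, is a running sum, so a cumulative sum is one possible loop-free alternative. This is a swapped-in technique rather than the code above, and the dataframe here is made up for illustration; it assumes a 'sampling frequency' column whose first entry is unused (shown as NaN, matching the "-" in the expected output).

```python
import pandas as pd

# Hypothetical data: the first row has no sampling frequency, later rows may vary.
df = pd.DataFrame({
    "counter": ["0", "1", "2", "3"],
    "sampling frequency": [float("nan"), 1.0, 1.0, 2.0],
}).set_index("counter")

# The first step contributes nothing, so that t0 = 0.
steps = df["sampling frequency"].copy()
steps.iloc[0] = 0.0

# Running sum of the steps: 0, 0+1, 0+1+1, 0+1+1+2
df["time"] = steps.cumsum()
```

Because it never assigns to individual cells, this version does not depend on the index being consecutive integers.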

Note that, because your sampling frequency is likely constant throughout the experiment, you could simplify the loop to:

sampling_freq = 1

df['time'] = np.zeros(len(df))

for i in range(1,len(df)):
    df.iloc[i, df.columns.get_loc('time')] = df.iloc[i-1, df.columns.get_loc('time')] + sampling_freq

But that is no better than simply creating the time series as in the first example.
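The question also asks for a plot with time on the x-axis and frequency, resistance, and phase on the y-axis. A minimal sketch with matplotlib, assuming a constant step of 1 and using the first four rows of the question's data; one subplot per quantity is just one possible layout, chosen here because the three columns have very different scales.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# First four rows of the question's data.
df = pd.DataFrame({
    "frequency": [15000.0, 16000.0, 17000.0, 18000.0],
    "resistance": [698.617126, 647.001708, 649.572265, 665.282775],
    "phase": [-0.745298, -0.269421, -0.097540, 0.008724],
})
df["time"] = np.arange(len(df)) * 1  # assumed constant step of 1

columns = ["frequency", "resistance", "phase"]

# One subplot per quantity, all sharing the time axis.
fig, axes = plt.subplots(len(columns), 1, sharex=True)
for ax, col in zip(axes, columns):
    ax.plot(df["time"], df[col])
    ax.set_ylabel(col)
axes[-1].set_xlabel("time")
fig.tight_layout()
```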

Ignatius Reilly
  • Thank you for the suggestion, but it throws the error "iloc cannot enlarge its target object". I tried to fix it by using loc, but that doesn't help. I am new to this; how can I fix it? – user19810659 Aug 23 '22 at 09:37
  • @user19810659 I added another way that uses `loc`, which [should handle that error](https://stackoverflow.com/questions/64139881/pandas-error-indexerror-iloc-cannot-enlarge-its-target-object). But keep in mind that the first one works with the dummy dataframe I was using, so you may want to find out why it would need to add more cells to the table you have. – Ignatius Reilly Aug 23 '22 at 17:25
  • Also, for the case of a constant sampling frequency, use the first solution. It should be more efficient. – Ignatius Reilly Aug 23 '22 at 17:28