Loop through rows of dataframe at specific row values

Question

My dataframe contains three different replications for each treatment. I want to loop through both levels: for each treatment, I want to fit a model to each replication. I managed to loop through the treatments, but I also need to loop through the replications within each treatment. Ideally, the output should be saved into a new dataframe that contains 'treatment' and 'replication'. Any suggestions?

The dataframe (df) looks like this:

  treatment  replication  time  y
**8          1            1     0.1**
  8          1            2     0.1
  8          1            3     0.1
**8          2            1     0.1**
  8          2            2     0.1
  8          2            3     0.1
**10         1            1     0.1**
  10         1            2     0.1
  10         1            3     0.1
**10         2            1     0.1**
  10         2            2     0.1
  10         2            3     0.1

for i, g in df.groupby('treatment'):
   k = g.iloc[0].y
   popt, pcov = curve_fit(model, x, y)
   fit_m = popt

I now use iterrows, but then I can no longer take element [0] of NPQ to get the initial value. Any idea how to solve this? The code and the resulting error message are:

for index, row in HL.iterrows():
  g = (index, row['filename'], row['hr'], row['time'], row['NPQ'])
  k = g.iloc[0]['NPQ']

AttributeError: 'tuple' object has no attribute 'iloc'

Thank you in advance

It's possible to do it without looping, which would improve the time efficiency of your code. We just need to know how you define `x` and `y` (the arguments of `curve_fit`) — Ralubrusto, Jan 22 '21 at 21:11
Keep this in mind, in general, with pandas, trying to solve a problem with a loop is the incorrect implementation. See [How to iterate over rows in a DataFrame in Pandas](https://stackoverflow.com/q/16476924/7758804) & [Fast, Flexible, Easy and Intuitive: How to Speed Up Your Pandas Projects](https://realpython.com/fast-flexible-pandas/) — Trenton McKinney, Jan 22 '21 at 22:15
@Ralubrusto, I define x= time, and y= y in dataframe. Thank you in advance — Martina Lazzarin, Jan 23 '21 at 18:34
@TrentonMcKinney please see my update in the question: I used iterrows, but then I cannot make the previous code work. Any advice? Thank you! — Martina Lazzarin, Jan 23 '21 at 19:38
Please show what your expected output is, given the sample dataframe. — Trenton McKinney, Jan 23 '21 at 20:03

score 0 · Answer 1 · answered Jan 27 '21 at 16:18

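from scipy.optimize import curve_fit

# Group on both keys so that every (hr, filename) combination is fitted on its own.
grouped_df = HL.groupby(["hr", "filename"])

for key, g in grouped_df:
   k = g.iloc[0]["NPQ"]   # initial NPQ value of this group
   # x is the time column and y the measured column, as stated in the comments;
   # NPQ is assumed here to be the measured column of HL.
   popt, pcov = curve_fit(model, g["time"], g["NPQ"])
   fit_m = popt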
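
For the sample dataframe in the question, the same idea might be sketched as below, with the fitted parameters collected into a new dataframe that keeps 'treatment' and 'replication'. The straight-line `model`, the made-up `y` values, and the names `records` and `fits` are assumptions for illustration only; the real model function is not shown in the question and should be substituted.

```python
import pandas as pd
from scipy.optimize import curve_fit


def model(t, a, b):
    # Hypothetical model (a straight line); replace with the real model function.
    return a * t + b


# Made-up data with the same columns as the question's df.
rows = []
for treatment in (8, 10):
    for replication in (1, 2):
        for time in (1, 2, 3, 4):
            y = 0.5 * replication * time + treatment
            rows.append({"treatment": treatment, "replication": replication,
                         "time": time, "y": y})
df = pd.DataFrame(rows)

records = []
# Grouping on both columns yields one sub-dataframe per (treatment, replication).
for (treatment, replication), g in df.groupby(["treatment", "replication"]):
    k = g["y"].iloc[0]  # initial value: first row of this group
    popt, pcov = curve_fit(model,
                           g["time"].to_numpy(dtype=float),
                           g["y"].to_numpy(dtype=float))
    records.append({"treatment": treatment, "replication": replication,
                    "k": k, "a": popt[0], "b": popt[1]})

# One row per (treatment, replication) with the fitted parameters.
fits = pd.DataFrame(records)
print(fits)
```

Because `g` is a dataframe here rather than a tuple, `.iloc[0]` works inside the loop, which avoids the AttributeError from the iterrows attempt.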

Martina Lazzarin
