How to use list.append() in pandas loops?

Question

I have 2 questions:

I have a dataset that contains some duplicate IDs, but some of them have different actions, so they can't be removed. For each ID, I want to do some math and store the final value to work with later. I already have the indices of the duplicates, but this code doesn't work properly and gives NaN.

How can I write a nested loop using pandas? The loop takes too much time to run. I've already tried iterrows(), but it didn't work.

    l_list = []
    for i in range(len(idx)):
        for j in range(len(idx[i])):
            if df.at[j,'action'] == 0:
                a = df.rank[idx[i]]*50
                b = df.study_list[idx[i]].str.strip('[]').str.split(',').str.len()
                l_list.append(a + b)

Please post an example from your input dataframe and the expected output. — navneethc, Jun 13 '21 at 14:43
@navneethc I made an example and added an image. For example, for ID = aaa, if its action is 0, I want its rank * 50 + the number of items in the study_list, which is 2. Then I want to do the same for the other rows with ID = aaa and action = 0, and finally have one value for this ID to work with later. I want to do this for all the IDs and have their assigned values. — Elahe, Jun 13 '21 at 14:59
In the future, please use the recommendations given in https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples to post questions about Pandas. — navneethc, Jun 13 '21 at 15:25
@navneethc thanks a lot. I'm sorry about that, I'm new to this community and didn't know the rules exactly. Thank you so much for helping. — Elahe, Jun 13 '21 at 15:36

score 0 · Answer 1 · answered Jun 13 '21 at 15:19

I don't know what the variable idx holds or how it was built, but I think your code is wrong. Try this code instead:

    l_list = []
    for i in range(len(idx)):
        for j in range(len(idx[i])):
            if df.at[j,'action'] == 0:
                a = df.rank[idx[i]]*50
                b = df.study_list[idx[i]].str.strip('[]').str.split(',').str.len()
                l_list.append(a + b)
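
Two things in this loop are worth checking. First, df.rank is the DataFrame.rank method, so attribute access does not return the rank column; bracket access (df['rank']) does. Second, j counts positions inside idx[i], so df.at[j, 'action'] reads row j of the frame rather than the row that idx[i][j] points to. Below is a minimal sketch of a corrected loop. It assumes that idx is a list holding one list of row labels per ID (built here with a hypothetical groupby stand-in), and that study_list holds strings such as '[a, b]', as the .str.strip call in the question suggests.

```python
import pandas as pd

# Sample data modeled on the example in this thread; study_list is
# assumed to hold strings, not Python lists.
df = pd.DataFrame({
    "ID": ["aaa", "bbb", "aaa"],
    "rank": [24, 6, 14],
    "action": [0, 1, 0],
    "study_list": ["[a, b]", "[1, 2, 3]", "[1, 2, 3, 4]"],
})

# Hypothetical stand-in for idx: one list of row labels per ID.
idx = [list(labels) for labels in df.groupby("ID").groups.values()]

l_list = []
for labels in idx:
    for label in labels:
        # Look up the row by its label, not by its position inside idx[i].
        if df.at[label, "action"] == 0:
            # Use the "rank" column by name; df.rank would be the method.
            a = df.at[label, "rank"] * 50
            # Count the comma-separated items between the brackets.
            b = len(df.at[label, "study_list"].strip("[]").split(","))
            l_list.append(a + b)

print(l_list)
```

Each appended value here is a single number for one row, so l_list ends up with one entry per row whose action is 0.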

score 0 · Accepted Answer · answered Jun 13 '21 at 15:24

Based on my understanding of what you've provided, see if this works:

In [15]: df
Out[15]:
    ID  rank  action    study_list
0  aaa    24       0        [a, b]
1  bbb     6       1     [1, 2, 3]
2  aaa    14       0  [1, 2, 3, 4]

In [16]: def do_thing(row):
    ...:     if row['ID'] == 'aaa' and row['action'] == 0:
    ...:         return row['rank'] * 50 + len(row['study_list'])
    ...:     else:
    ...:         return 100 * row['rank']
    ...:

In [17]: df['new_value'] = df.apply(do_thing, axis=1)

In [18]: df
Out[18]:
    ID  rank  action    study_list  new_value
0  aaa    24       0        [a, b]       1202
1  bbb     6       1     [1, 2, 3]        600
2  aaa    14       0  [1, 2, 3, 4]        704

NOTE: I have made many simplifications, as your post doesn't provide a reproducible example. Read this thread to see how to best ask questions about Pandas. I also can't guarantee speed, as you have not provided details about the size of the dataset.
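
If apply with axis=1 turns out to be slow on a large frame, a vectorized variant may help. The following is a minimal sketch rather than a definitive solution. It assumes study_list holds strings such as '[a, b]' (as the .str.strip call in the question suggests), and it assumes the per-ID value is the sum over that ID's rows with action == 0, which the question does not state explicitly.

```python
import pandas as pd

# Sample data modeled on the example above; study_list is assumed
# to hold strings, not Python lists.
df = pd.DataFrame({
    "ID": ["aaa", "bbb", "aaa"],
    "rank": [24, 6, 14],
    "action": [0, 1, 0],
    "study_list": ["[a, b]", "[1, 2, 3]", "[1, 2, 3, 4]"],
})

# Count the comma-separated items in each bracketed string.
# Caveat: an empty "[]" would be counted as one item by this approach.
n_items = df["study_list"].str.strip("[]").str.split(",").str.len()

# Keep only rows whose action is 0, then compute rank * 50 + item count.
mask = df["action"] == 0
values = df.loc[mask, "rank"] * 50 + n_items[mask]

# Collapse to one value per ID; summing is an assumption.
per_id = values.groupby(df.loc[mask, "ID"]).sum()

print(per_id)
```

IDs that have no rows with action == 0 (bbb in this sample) do not appear in per_id. Swap .sum() for another aggregation such as .mean() or .max() if a different per-ID value is wanted.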
