For loop over dataframe python

Question

I have a dataframe called df_civic with the columns State, Rank, Make/Model, Model Year and Thefts. I want to calculate the mean and standard deviation of Thefts for each model year.

I collect all the years present in the dataframe with: years_civic = list(pd.unique(df_civic['Model Year']))

My loop looks like this:

for civic_year in years_civic:
    f = df_civic['Model Year'] == civic_year
    civic_avg = df_civic[f]['Thefts'].mean()
    civic_std = df_civic[f]['Thefts'].std()
    civic_std= np.round(car_std,2)
    civic_avg= np.round(car_avg,2)
    print(civic_avg, civic_std, np.sum(f))

However, the output is not what I need. The only value that is correct is the one from np.sum(f); the mean and standard deviation are the same on every line.

The output currently looks like this:

9.0 20.51 1
9.0 20.51 1
9.0 20.51 1
9.0 20.51 1
9.0 20.51 13
9.0 20.51 15
9.0 20.51 3
9.0 20.51 2

Please include sample data and format your question according to tips provided in this post: https://stackoverflow.com/a/20159305 — navneethc, Jan 06 '21 at 17:22
@Aleksander, you can use triple ``` code ```, to mark a code block over multiple lines. Took a while to edit your 100s of code blocks and
s :) .. Also, you can simple move a new line to another line without using
. Its allowed in markdown. Check how I edited your question to format your question better next time. Cheers. — Akshay Sehgal, Jan 06 '21 at 17:22

Nicoowr · Accepted Answer · 2021-01-06T17:46:56.577


Pandas provides many powerful functions for aggregating data. It is usually better to check whether one of them does the job before reaching for a for loop.

For instance, you can use:

import pandas as pd

# Group by model year, then compute the mean and sample standard deviation of Thefts
df_civic.groupby("Model Year").agg({"Thefts": ["mean", "std"]})

More documentation here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.agg.html
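As a minimal sketch of the groupby approach: the sample data below is hypothetical, since the real df_civic was not posted, and only the column names "Model Year" and "Thefts" are taken from the question. The "count" column plays the role of np.sum(f) in the original loop.

```python
import pandas as pd

# Hypothetical sample data standing in for the real df_civic
df_civic = pd.DataFrame({
    "Model Year": [1998, 1998, 2000, 2000, 2000],
    "Thefts": [10, 20, 5, 7, 9],
})

# One row per model year: mean, sample std (ddof=1) and number of rows
summary = (
    df_civic.groupby("Model Year")["Thefts"]
    .agg(["mean", "std", "count"])
    .round(2)
)
print(summary)
```

The result is indexed by model year, so the values for a single year can be read with, for example, summary.loc[2000].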

Regarding your code, the two rounding lines use car_std and car_avg, which are never defined inside the loop. You probably meant civic_std and civic_avg.
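If you want to keep the loop, a corrected version might look like the sketch below. It assumes the only problem is the two variable names; the sample data is hypothetical, and the results list is an addition for illustration that is not in the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real df_civic
df_civic = pd.DataFrame({
    "Model Year": [1998, 1998, 2000, 2000, 2000],
    "Thefts": [10, 20, 5, 7, 9],
})
years_civic = list(pd.unique(df_civic["Model Year"]))

results = []  # illustrative: collects (year, mean, std, row count)
for civic_year in years_civic:
    f = df_civic["Model Year"] == civic_year  # boolean mask for this year
    # Round the values computed in this iteration, not car_avg / car_std
    civic_avg = np.round(df_civic.loc[f, "Thefts"].mean(), 2)
    civic_std = np.round(df_civic.loc[f, "Thefts"].std(), 2)
    results.append((civic_year, civic_avg, civic_std, int(f.sum())))
    print(civic_avg, civic_std, f.sum())
```

Note that for a year with a single row, .std() returns NaN, because the sample standard deviation is undefined for one observation.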

edited Jan 06 '21 at 17:46

answered Jan 06 '21 at 17:21


Why hurry? You haven't told the OP where he might be going wrong, and while yours likely a better solution, why offer an untested one? – navneethc Jan 06 '21 at 17:26
@navneethc can you indicate what I'm doing wrong, I'd love to know that as well honestly. – Aleksander Kuś Jan 06 '21 at 17:30
@AleksanderKuś Can you post sample data? – navneethc Jan 06 '21 at 17:44

I checked the code, loop itself is correct - it's working, I didn't define car_std and car_avg correctly and it took value from other loop. Thanks guys. – Aleksander Kuś Jan 06 '21 at 18:08
