I need to loop through certain rows in my CSV file, for example, rows 231 to 252. Then I want to add up the values I get from a calculation on each of those rows and divide the total by the number of rows I looped through. How would I do that?
I'm new to pandas so I would really appreciate some help on this.
I have a CSV file from Yahoo Finance that looks something like this (it has many more rows):
Date,Open,High,Low,Close,Adj Close,Volume
2019-06-06,31.500000,31.990000,30.809999,31.760000,31.760000,1257700
2019-06-07,27.440001,30.000000,25.120001,29.820000,29.820000,5235700
2019-06-10,32.160000,35.099998,31.780001,32.020000,32.020000,1961500
2019-06-11,31.379999,32.820000,28.910000,29.309999,29.309999,907900
2019-06-12,29.270000,29.950001,28.900000,29.559999,29.559999,536800
I have done the basic steps of importing pandas and reading the file. Then I added two variables, one for each column I need, so I can easily refer to just those columns.
import pandas as pd
df = pd.read_csv(file_name)
high = df.High
low = df.Low
Then I tried something like this. I stored a .loc slice in a variable and looped over it, but that didn't seem to work. This is maybe super dumb, but I'm really new to pandas.
dates = df.loc[231:252, :]
for rows in dates:
    # calculations here
    # for example:
    print(high - low)
    # I would have a more complex calculation than this,
    # but for simplicity's sake let's stick with this.
The output is high - low printed for every row, 1-252, for example:
...
231 3.319997
232 3.910000
233 1.050001
234 1.850001
235 0.870001
...
But I only want this output on a certain number of rows.
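For reference, here is a minimal sketch of restricting the calculation to a range of rows. It assumes the default integer index from read_csv, uses the five sample rows above in place of the full file, and picks the hypothetical bounds 1 and 3 where the real file would use 231 and 252. It also shows why the loop above never sees individual rows: iterating over a DataFrame yields its column labels.

```python
import io
import pandas as pd

# The five sample rows from the question stand in for the full CSV file.
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2019-06-06,31.500000,31.990000,30.809999,31.760000,31.760000,1257700
2019-06-07,27.440001,30.000000,25.120001,29.820000,29.820000,5235700
2019-06-10,32.160000,35.099998,31.780001,32.020000,32.020000,1961500
2019-06-11,31.379999,32.820000,28.910000,29.309999,29.309999,907900
2019-06-12,29.270000,29.950001,28.900000,29.559999,29.559999,536800
"""
df = pd.read_csv(io.StringIO(csv_text))

# Iterating over a DataFrame yields its column labels, not its rows.
column_labels = [label for label in df]
print(column_labels)

# Hypothetical bounds for this small sample; the real file would use 231 and 252.
first_row, last_row = 1, 3

# .loc slices by index label and includes both endpoints.
subset = df.loc[first_row:last_row]

# The calculation is applied to the selected rows only, with no explicit loop.
spread = subset["High"] - subset["Low"]
print(spread)
```

Because high and low in the original attempt refer to the full columns, print(high - low) prints every row no matter what is being looped over. Taking the columns from the sliced frame keeps the result to the chosen rows.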
Then I want to add up all of those values and divide the total by the number of rows I looped through. This part is simple, so you don't need to include it in your answer, but it's okay if you do.
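The averaging step could look roughly like this. It is only a sketch under the same assumptions as the sample data: a default integer index, the five rows shown earlier instead of the full file, and hypothetical bounds 1 and 3 in place of 231 and 252. Summing and dividing by the row count is the same as calling mean() on the result.

```python
import io
import pandas as pd

# The five sample rows from the question stand in for the full CSV file.
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2019-06-06,31.500000,31.990000,30.809999,31.760000,31.760000,1257700
2019-06-07,27.440001,30.000000,25.120001,29.820000,29.820000,5235700
2019-06-10,32.160000,35.099998,31.780001,32.020000,32.020000,1961500
2019-06-11,31.379999,32.820000,28.910000,29.309999,29.309999,907900
2019-06-12,29.270000,29.950001,28.900000,29.559999,29.559999,536800
"""
df = pd.read_csv(io.StringIO(csv_text))

# Hypothetical bounds for this small sample; the real file would use 231 and 252.
first_row, last_row = 1, 3

# Select the rows and both columns in one step; .loc includes both endpoints.
selected = df.loc[first_row:last_row, ["High", "Low"]]
spread = selected["High"] - selected["Low"]

# Add up the values and divide by the number of rows selected.
manual_average = spread.sum() / len(spread)

# Series.mean() gives the same number in one call.
average = spread.mean()
print(manual_average, average)
```

If the real calculation is more complex than high - low, the same pattern should still apply as long as it can be written in terms of whole columns of the selected frame.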