Using a for loop to add values across different excel sheets but need separated values

Question

hi:) i am trying to make a for loop to reduce redundancy in my code, where i need to access a number of different sheets within an excel file, count the number of specific values and later plot a graph. my code for my for loop looks like this at the moment:

df = pd.read_excel('C:/Users/julia/OneDrive/Documents/python assignment/2016 data -EU values.xlsx', 
skiprows=6)
sheets_1 = ["PM10 ", "PM2.5", "O3 ", "NO2 ", "BaP", "SO2"]
resultM1 = 0
for sheet in sheets_1:
    print(sheet[0:5])
    for row in df.iterrows():
        if row[1]['Country'] == 'Malta':
            resultM1 += row[1]['AirPollutionLevel']
print(resultM1)

i would like for the output to look something like this: PM10 142 PM2.5 53 O3 21 NO2 3 BaP 21 SO2 32

but what i'm getting is just the sheet names printed after each other and the total amount of the sepcific value i need across all sheets. i.e. PM10 PM2.5 O3 NO2 BaP SO2 284.913786

i really need the values separated into their respective sheet and not added together.

attached is a screeshot of the excel file. as u can see, there are different sheets and many values within -i need to add values for a specific country in each sheet.

any help would be greatly appreciated!

can you share sample data from the dataframe so we know what's being processed — Joe Ferndz, Feb 12 '21 at 18:25
Also, it should be `for index, row in df.iterrows():` However, I recommend NOT to use for loop on dataframe. You should try to do this in other ways. Also read [How to iterate over rows in a DataFrame in Pandas](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas) — Joe Ferndz, Feb 12 '21 at 18:26
is it possible to share the actual data in text format so we can try it. screenshot may not help — Joe Ferndz, Feb 12 '21 at 18:28
@JoeFerndz i dont think i can share the file, its quite large (3000 rows x 6 sheets) and wont let me convert it to a text file — Julia Micallef Filletti, Feb 12 '21 at 18:33
@CiaranOBrien i added the range in case it would be able to insert the summation next to each sheet, which i realise is incorrect — Julia Micallef Filletti, Feb 12 '21 at 18:35

Ciaran O Brien · Answer 1 · 2021-02-12T22:15:56.630

0

import pandas as pd
# Open as Excel object
xls = pd.ExcelFile('C:/Users/julia/OneDrive/Documents/python assignment/2016 data -EU values.xlsx')
# Get sheet names
sheets_1 = xls.sheet_names

# Dictionary of SheetNames:dfOfSheets
sheet_to_df_map = {}
for sheet_name in xls.sheet_names:
    sheet_to_df_map[sheet_name] = xls.parse(sheet_name)

# create list to store results
resultM1 = []
# Loop over keys and df in dictionary
for key, df in sheet_to_df_map.items():
    # remove top 5 blank rows
    df = df = df.iloc[5:]
    # set column names as first row values
    headers = df.iloc[0]
    df  = pd.DataFrame(df.values[1:], columns=headers)
    #loop over rows in the df and create pd series to store Malta results
    results  =df.loc[df['Country'] == "Malta", 'AirPollutionLevel']
    # Loop over the results for Malta from each sheet and append 'Malta' and then append the value to a list
    for i in results:
        resultM1.append(key)
        resultM1.append(i)

# Convert list to df
df = pd.DataFrame(resultM1) 
# Rename column
df = df.rename({0: 'Sheet'}, axis=1)  
# create two columns
final = pd.DataFrame({'Sheet':df['Sheet'].iloc[::2].values, 'Value':df['Sheet'].iloc[1::2].values})

edited Feb 12 '21 at 22:15

answered Feb 12 '21 at 18:40

Ciaran O Brien

374
3
13

thanks @Ciaran! that should have worked but it printed this instead: {'PM10 ': 32.0, 'PM2.5': 32.0, 'O3 ': 32.0, 'NO2 ': 32.0, 'BaP': 32.0, 'SO2': 32.0} and there's no way that each sheet has the same number of values, i think it may have counted the number of times the country was mentioned instead of counting the values associated with that country? – Julia Micallef Filletti Feb 12 '21 at 18:45
Is this data available online and I can take a look? @JuliaMicallefFilletti – Ciaran O Brien Feb 12 '21 at 18:49
my teacher provided us with this data, not sure where he got it from as i couldnt find it anywhere. its from World Health Organisation. i checked their site to see where he could've gotten it from, but cant find anything :( – Julia Micallef Filletti Feb 12 '21 at 18:52
@JuliaMicallefFilletti Try it now. The problem is your line `for row in df.iterrows():` because your are looking at row in original df, not row in sheet. – Ciaran O Brien Feb 12 '21 at 19:47

Using a for loop to add values across different excel sheets but need separated values

1 Answers1