In the following code, there are two dataframes that are identically labelled (recent_grads and all_ages):
majors = recent_grads['Major'].unique()
rg_lower_count = 0
for m in majors:
    recent_grads_row = recent_grads[recent_grads['Major'] == m]
    all_ages_row = all_ages[all_ages['Major'] == m]
    rg_unemp_rate = recent_grads_row.iloc[0]['Unemployment_rate']
    aa_unemp_rate = all_ages_row.iloc[0]['Unemployment_rate']
    if rg_unemp_rate < aa_unemp_rate:
        rg_lower_count += 1
print(rg_lower_count)
Why do I need the iloc[0] part (in the two lines that assign rg_unemp_rate and aa_unemp_rate)? Since each selection (recent_grads_row and all_ages_row) contains only one row, I would expect there to be no need to specify which row to use in the comparison.
Yet, without it I get this error message:
ValueError: Can only compare identically-labeled Series objects
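The error seems to be reproducible with a minimal sketch like the one below. The two toy dataframes are made up for illustration (they are not the real datasets), and the sketch assumes that the version without iloc[0] selects the column as row['Unemployment_rate']. The same major sits at a different row position in each frame, so each filtered result is a one-element Series carrying a different index label.

```python
import pandas as pd

# Hypothetical toy data, not the real datasets. 'BIOLOGY' is at row 0 in
# one frame and at row 1 in the other.
recent_grads = pd.DataFrame({
    'Major': ['BIOLOGY', 'ECONOMICS'],
    'Unemployment_rate': [0.07, 0.10],
})
all_ages = pd.DataFrame({
    'Major': ['ECONOMICS', 'BIOLOGY'],
    'Unemployment_rate': [0.06, 0.06],
})

# Without iloc[0]: each result is a one-element Series that keeps the
# index label of the row it came from.
rg = recent_grads[recent_grads['Major'] == 'BIOLOGY']['Unemployment_rate']
aa = all_ages[all_ages['Major'] == 'BIOLOGY']['Unemployment_rate']
print(rg.index.tolist(), aa.index.tolist())  # [0] [1]

# Comparing two Series whose index labels differ raises the ValueError.
try:
    rg < aa
    raised = False
except ValueError as err:
    raised = True
    print(err)

# With iloc[0]: the first element is pulled out by position, so two plain
# numbers are compared and the index labels no longer matter.
scalar_result = rg.iloc[0] < aa.iloc[0]
print(scalar_result)
```

If this sketch matches the real situation, the length of each Series is not what matters: pandas compares Series by index label, and the two one-element Series here carry different labels.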