AttributeError: 'DataFrame' object has no attribute 'value_counts'

Question

I've started using pandas this week so pardon me if I'm asking a really obvious question. I am trying to get pandas to loop through all the sheets in my Excel wb to extract a specific column, column F for counts of occurances via value_counts.

Previously I used value_counts and typed in the specific name of the worksheet and it worked in pulling out the value count of that 1 sheet. However, the minute I replaced the sheet name to df1, it stops working.

df = pd.ExcelFile("filepath.xlsx")
for df1 in df.sheet_names:
    df2 = pd.read_excel("filepath.xlsx", sheet_name=df1, usecols="F")
    df2.dropna(inplace=False)
    print (df2.value_counts())

I expect the output to be the unique values and their occurance numbers but it returns:

AttributeError: 'DataFrame' object has no attribute 'value_counts'

Can somebody please help me?

The [documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html) mentions this function for a `Series`. If you want to get it for multiple columns, use [`apply`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html). [*Some discussions about*](https://stackoverflow.com/questions/32589829/how-to-get-value-counts-for-multiple-columns-at-once-in-pandas-dataframe) — Alexandre B., Aug 04 '19 at 18:23
@Trenton_M Hi! does this solution work for specific columns in all the excel sheets in the workbook? — MeHasTooManyQueries..., Aug 04 '19 at 19:20
replace `df2.value_counts()` with `df2.apply(pd.Series.value_counts)` inside the loop — Trenton McKinney, Aug 04 '19 at 19:27

Valdi_Bo · Accepted Answer · 2019-08-04T19:40:16.043

0

You have a loop running on each sheet in the source Excel file. It is OK, but you should apply value_counts separately to each DataFrame (read from each sheet).

And a remark concerning variable names: Use df for DataFrames only. Neither Excel file nor sheet name are DataFrames.

So the loop should be somenthing like:

exFile = pd.ExcelFile("filepath.xlsx")
for sheet in exFile.sheet_names:
    df = pd.read_excel("filepath.xlsx", sheet_name=sheet, usecols="F")
    df.dropna(inplace=True)
    print('Sheet', sheet)
    print(df.apply(pd.Series.value_counts))

edited Aug 04 '19 at 19:40

answered Aug 04 '19 at 19:18

Valdi_Bo

30,023
4
23
41

Omg I was under the impression df was needed for all pd for like the entire wk LOL – MeHasTooManyQueries... Aug 04 '19 at 19:46

AttributeError: 'DataFrame' object has no attribute 'value_counts'

1 Answers1