
I'm having some trouble with pandas. I opened an .xlsx file with pandas, but when I try to filter anything, I get this error:

AttributeError: 'dict' object has no attribute 'head' #(or iloc, or loc, or anything else from DF/pandas)#

So, I did some research and realized that my table turned into a dictionary (why?).

I'm trying to convert this mess into a proper dictionary so that I can then convert it into a proper DataFrame, because right now it shows some characteristics of both. I just need a DataFrame, that's all.

Here is the code:

import pandas as pd

df = pd.read_excel('report.xlsx', sheet_name = ["May"])
print(df)

Result: it shows the table plus "[60 rows x 24 columns]"

But when I try to filter or iterate, I get every possible dict AttributeError.

Some things I tried: .from_dict, xls.parse/(df.to_dict). When I try to convert df to a dict properly, it shows

ValueError: If using all scalar values, you must pass an index 

I tried the answers at this link: https://stackoverflow.com/questions/17839973/constructing-pandas-dataframe-from-values-in-variables-gives-valueerror-if-usi [1], but they didn't work. One of the errors said that I should provide 2-d parameters, which is why I tried to create a new dict and do a sort of 'append', but that didn't work either.

Then I tried everything I could to set an index, but I can't even rename columns, because it says .iloc is not an attribute of dict.

I'm new to Python, but I have never seen pd.read_excel return a dict instead of a DataFrame. What should I do?

Thanks!

[1]: Constructing pandas DataFrame from values in variables gives "ValueError: If using all scalar values, you must pass an index"

Rodalm
ana_m
    Please read [the documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html): `pd.read_excel` returns a "DataFrame or dict of DataFrames from the passed in Excel file. **See notes in sheet_name argument for more information on when a dict of DataFrames is returned.**" – ddejohn Jun 15 '22 at 20:02
  • if `df = pd.read_excel('report.xlsx', sheet_name = ["May"])` is actually a `dict` of `DataFrames` you might be able to try `pd.concat(df.values(), names=df.keys())` – Jason Leaver Jun 16 '22 at 01:50

1 Answer

If it's a dict of DataFrames, try:

>>> dict_df = {"a":pd.DataFrame([{1:2,3:4},{1:4,4:6}]), "b":pd.DataFrame([{7:9},{1:4}])}
>>> dict_df
{'a':    1    3    4
0  2  4.0  NaN
1  4  NaN  6.0, 'b':      7    1
0  9.0  NaN
1  NaN  4.0}
>>> pd.concat(dict_df.values(),keys=dict_df.keys(), axis=1)
   a              b     
   1    3    4    7    1
0  2  4.0  NaN  9.0  NaN
1  4  NaN  6.0  NaN  4.0
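
If only one sheet is needed, as in the question, a simpler route may be to pull that single DataFrame out of the dict by its sheet name. Per the `read_excel` documentation, a list passed as `sheet_name` returns a dict keyed by sheet name, while a single string such as `sheet_name="May"` returns a DataFrame directly. The sketch below is only an illustration: `unwrap_sheet` is a hypothetical helper (not part of pandas), and the dict is a made-up stand-in for the result of reading `report.xlsx`, so that the example runs without the file.

```python
import pandas as pd


def unwrap_sheet(result, sheet_name):
    """Return a single DataFrame from whatever pd.read_excel returned.

    Hypothetical helper: if `result` is a dict of DataFrames, pick the
    entry for `sheet_name`; otherwise assume it is already a DataFrame.
    """
    if isinstance(result, dict):
        return result[sheet_name]
    return result


# Stand-in for: pd.read_excel('report.xlsx', sheet_name=["May"])
# A list for sheet_name yields a dict keyed by sheet name.
# The column names and values here are invented for the example.
sheets = {"May": pd.DataFrame({"item": ["a", "b", "c"], "qty": [1, 2, 3]})}

may = unwrap_sheet(sheets, "May")

# `may` is now a regular DataFrame, so head/loc/iloc are available.
print(type(may).__name__)
print(may.head(2))
```

With the real file, `pd.read_excel('report.xlsx', sheet_name="May")` (a string, not a list) should avoid the dict altogether.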
Jason Leaver
  • Thank you! So I guess I was just lucky that this mistake had not happened before. I fixed it by using df1 = pd.DataFrame(df) right after opening the file. Anyway, there is good information in your method for me to study. Thank you! – ana_m Jun 16 '22 at 21:40