1

This program works well for reading 1st sheet of the excel file, while I am trying to read all the sheets of the excel file I cannot read all the sheets.

python

while I add sheet_name = None in the reading excel line I am getting error like this I can not find out what is the error and how to rectify the error.

(scrap-3-HFZx_P-py3.9) PS F:\mohan> & C:/Users/hp/AppData/Local/pypoetry/Cache/virtualenvs/scrap-3-HFZx_P-py3.9/Scripts/python.exe f:/mohan/main.py
ENTER THE LIST HERE : userid,city,state
enter the full path to the file : F:\\mis\\KB.xlsx
Traceback (most recent call last):
  File "f:\mohan\main.py", line 16, in <module>     
    print(obj.extract(file_name))
  File "f:\mohan\main.py", line 9, in extract       
    return raw_excel[conf].to_dict(orient='records')
TypeError: unhashable type: 'list'

  • If use one sheetname, it means instead `userid,city,state` only `userid` solution working? – jezrael Jan 11 '22 at 06:26
  • `raw_excel[conf].to_dict(orient='records')` this line is probably wrong ... you mean `raw_excel[self.config].to_dict(orient='records')` ? – asifzuba Jan 11 '22 at 06:28
  • what is expected ouput if pass multiple `sheetname`s ? dictionary of each sheetname? – jezrael Jan 11 '22 at 06:31
  • @jezrael my expected output is list of dictionaries that containing all sheets specified column values –  Jan 11 '22 at 06:34
  • @asifzuba if change conf to config im getting the same error Traceback (most recent call last): File "f:\mohan\main.py", line 16, in print(obj.extract(file_name)) File "f:\mohan\main.py", line 9, in extract return raw_excel[self.config].to_dict(orient='records') TypeError: unhashable type: 'list' –  Jan 11 '22 at 06:36
  • Yes, that won't solve your error, but it is definitely wrong... you are getting away here because you have `conf` defined outside your class. Please see the answers for more clarity on your problem. Thanks! – asifzuba Jan 11 '22 at 06:42

2 Answers2

1

Solution:

#output is nested lists of list of dictionaries
def extract(self, file_name):
    raw_excel=pd.read_excel(file_name,sheet_name=None)
    return [v[v.columns.intersection(conf)].to_dict(orient='records')
                  for k, v in raw_excel.items()]

Explanation:

If use sheet_name=None output is dictionary of DataFrames, here raw_excel.

If need loop by dict here is used list comprehension with method items, so v is values and k for keys.

For filter only columns from DataFrame if exist in conf is used Index.intersection.

Last is used to_dict, so get for each DataFrame list of dictionaries. Final output, in another words return get lists of list of dictionaries.

If need flatten ouput is possible use this solution:

flat_list = [item for sublist in t for item in sublist]

So code is changed:

#output is flatten list of dictionaries
def extract(self, file_name):
    raw_excel=pd.read_excel(file_name,sheet_name=None)
    return [x for k, v in raw_excel.items() 
               for x in v[v.columns.intersection(conf)].to_dict(orient='records')]
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
0
  • This is a bit quirky, but when you do sheet_name=None, the returned type is not a pandas DataFrame but a dict. I personally don't like when return type changes with different arguments but it is what it is

  • However, this solves the mystery of the type error. Because you are now getting a dict back, and so when you subset with a list ...it can't do that operation and will give you a Unhashable Type error.

  • I would suggest that you have a conditional which checks for type and then does some extra processing.

  • Let me know if you need an elaborate example for type checking.

Dharman
  • 30,962
  • 25
  • 85
  • 135
asifzuba
  • 450
  • 3
  • 7
  • @MohanrajM - list comprehension not working? – jezrael Jan 11 '22 at 06:46
  • That's not working that change shows key error.. ' Traceback (most recent call last): File "f:\mohan\main.py", line 16, in print(obj.extract(file_name)) File "f:\mohan\main.py", line 9, in extract return [raw_excel[x].to_dict(orient='records') for x in conf] File "f:\mohan\main.py", line 9, in return [raw_excel[x].to_dict(orient='records') for x in conf] KeyError: 'col_RequestID' ' –  Jan 11 '22 at 06:50