1

Beginner coder here

I am trying to create multiple dataframes from multiple excel sheets in a single notebook with dataframe names being same as sheet names but I am unable to do so.

I have tried this but to no avail. Kindly help me on this.

file_name='file.xlsx'     
xl = pd.ExcelFile(file_name)
dfs = {sh:xl.parse(sh) for sh in xl.sheet_names}
for key in dfs.keys():
    dfs[key] = pd.DataFrame()

Expected Result is

excelbook contains sheet1 sheet2

I need to create two dataframes: sheet1 and sheet2
containing all the columns of sheet1 and sheet2 

result that I am getting is I am able to create dictionary having all the dataframe as key and their columns as values but I need them all seperately out of the dictionary. as

dfs[sheet1]
dfs[sheet2]

i created a loop like this

for key in dfs.keys():
    dfs[key] = pd.DataFrame()

but it is creating dataframe for the first key value pair only. df_sheet1 Kindly help me on this.

noodcoder
  • 55
  • 4
  • 1
    Welcome to SO, could you add to the question what error you are facing. What is the expected output and what is the output you are getting? – shoaib30 Jul 28 '21 at 10:01
  • as per your edit, you are looping and creating new empty dataframes and replacing your dataframes in the dictionary with them. – shoaib30 Jul 28 '21 at 11:06
  • yes I need to rectify this your code too is doing the same thing I create a loop that creates dataframes for each key value pairs not replace it with next in interation this has been the problem from the start that I am facing – noodcoder Jul 28 '21 at 11:22
  • that is how a loop is meant to work. What you are asking for and if I understand correctly is to create new variables inside the loop for each sheet. There may be hacks to do that in python (https://stackoverflow.com/questions/1373164/how-do-i-create-variable-variables) but I think you need to understand foundational concepts around variables and data structures. – shoaib30 Jul 28 '21 at 11:30

1 Answers1

1

You need to use the read_excel function to read a sheet from the excel

import pandas as pd

xls = pd.ExcelFile('sample.xlsx')
dfs = {sh: pd.read_excel(xls, sh) for sh in xls.sheet_names}

This will create a dictionary of DataFrames corresponding to each sheet in the Workbook.

Source: https://stackoverflow.com/a/26521726/5236575


Edit: Assuming you have sheet1 and sheet2 in your workbook, you can access them as

df_sheet1 = dfs['sheet1']
df_sheet2 = dfs['sheet2']
shoaib30
  • 877
  • 11
  • 24
  • Hi Shoaib I was able to the same but I need them individually but not as key- value pair inside dictionary. thanks for the help though – noodcoder Jul 28 '21 at 10:10
  • @noodcoder the dictionary can be used in a manner that is fit for your implementation, the scope is outside the question, but I have added a snippet for it, I would suggest looking into working with dictionaries in python – shoaib30 Jul 28 '21 at 10:15
  • HI Shoiab thanks for this but as per my question ``` for key in dfs.keys(): dfs[key] = pd.DataFrame() ``` is the part that I am stuck at . I require my sheets as individual dataframes not as key value pairs of the dictionary. – noodcoder Jul 28 '21 at 10:28
  • @noodcoder the variables `df_sheet1` and `df_sheet2` are individual dataframes. you can do it in a loop too `for sh in dfs: df = dfs[sh]` and you have the individual sheet as `df` inside the scope of the loop. I might not be understanding what you need correctly, but I would still recommend looking into how to work with dictionaries – shoaib30 Jul 28 '21 at 10:41
  • HI Shoaib thnaks for this but ``` for sh in dfs: df = dfs[sh]``` is providing the same issue that I came to the forum with the I am getting the first data frame and the loop is ending there I need to have name of the sheet as the dataframe name and all the dataframes. I know that dictionary works here but I am bounded by the requirement of individual dataframes outside of the dictionary. – noodcoder Jul 28 '21 at 10:46
  • @noodcoder how exactly do you expect the dataframe to be? In a list? in variables? at the end of the day, a dataframe is an object(in oops terminology) and you can have any data structure you need. – shoaib30 Jul 28 '21 at 11:04