0

I have a folder consisting of several excel documents.

For each file in the list I want to go to 3 specific sheets (that are present in each of the files) and copy these sheets into a new workbook.

So the it looks like this

Folder:
    File1
        Sheet1
        Sheet2
        ...
        Sheetn

    File2
        Sheet1
        Sheet2
        ...
        Sheetn


       ...


    Filen
        Sheet1
        Sheet2
        ...
        Sheetn

The sheet names are not titled like this but all have a similar naming structure so I want to write something like this:

new_file = excel workbook # create a new workbook (not sure the syntax)
for file in folder:
    open file
    for sheet in file:
        if sheetname like 'foobar1' or sheetname like 'foobar2'....:
           copy sheet into new_file
save new_file

The problem is I don't know what libraries to use or exactly how to write this properly.

I am not well versed with using python to interact with excel documents.

Any ideas?

  • 1
    Possible duplicate of [Excel Python API](https://stackoverflow.com/questions/6698229/excel-python-api) – LazyCoder Jul 28 '19 at 13:27
  • I don't think this is a duplicate. I am asking on how to do this rather than that question which is asking if it is possible. – chrisrever111 Jul 28 '19 at 13:53
  • Install xlrd library via this command: pip install xlrd. After that read each sheet or cell or xlx file by these commands: wb = xlrd.open_workbook('address') sheet = wb.sheet_by_index(0) # For row 0 and column 0 sheet.cell_value(0, 0) – BarzanHayati Jul 28 '19 at 18:53
  • Or, read via pandas library: import pandas as pd df = pd.read_excel (r'Path where the Excel file is stored\File name.xlsx', sheet_name='Type here the name of your Excel sheet') print (df) – BarzanHayati Jul 28 '19 at 18:55
  • https://datatofish.com/read_excel/ https://www.geeksforgeeks.org/reading-excel-file-using-python/ – BarzanHayati Jul 28 '19 at 18:57

1 Answers1

0

At first, install all libraries that you need to read all excel files from your directories.

pip install xlrd
pip install pandas
pip install xlsxwriter

Then, import these to your code:

import os
import xlrd
import pandas as pd
import xlsxwriter
address='E:\\DataFrames\\CSV\\'
List_SubFolders=os.listdir(address)
number=0
with pd.ExcelWriter('E:/DataFrames/output.xlsx') as writer:  # doctest: +SKIP
    for folders in List_SubFolders:
        temp_folder=folders
        List_XLS_In_Directory=os.listdir(address+str(temp_folder))
        for xls in List_XLS_In_Directory:
            #print(address+str(temp_folder)+'\\'+str(xls))
            df = pd.read_excel(address+str(temp_folder)+'\\'+str(xls), sheet_name='Sheet1')
            #df = pd.read_excel('E:/DataFrames/CSV/1/EZ Apply GPA Calculator.xlsx', sheet_name='Sheet1')
            number+=1
            df.to_excel(writer, sheet_name='Sheet_name_'+str(number))

Finally, you have a excel file that has a sheet for each one. I read all excel files (or csv) in this directory.

BarzanHayati
  • 637
  • 2
  • 9
  • 22