I have some files in a folder. most but not all in excel format. The files names are all like this:
doe_jane1234_question_12345_123456_JaneDoe.xlsx doe_john6578_question_56789_567890_JohnDoe.xlsx
Using Python, I'd like to open each excel file, grab the first sheet, and paste it into a new excel file and then rename the newly created sheets in the new file to
doe_jane
doe_john
The new Excel file will contain multiple sheets, each corresponding to a file the sheet was copied from.
This is the code I have come up with so far: first I create an empty excel file called output.xlsx in my directory.
import pandas as pd
import os
import re
cwd = os.path.join("C:\\directory")
files = os.listdir(cwd)
outputpath = os.path.join(cwd, 'output.xlsx')
for i in files:
# sheet name is the string before the firs number
sheetname = re.search(r'^[^\d]*', i)
input_file_path = os.path.join(cwd,i)
df_in = pd.read_excel(input_file_path )
with pd.ExcelWriter(outputpath , engine="openpyxl", mode='a') as writer:
df_in.to_excel(writer, sheet_name=sheetname)
when I run this I get an error:
File "C:\Anaconda3\lib\site-packages\openpyxl\workbook\child.py", line 90, in title
m = INVALID_TITLE_REGEX.search(value)
TypeError: expected string or bytes-like object