I have a Jupyter notebook that is pulling data from Athena into a Dataframe and then inserting that dataframe into an Excel Workbook using xlsxwriter
The dates in excel require the custom formatting [$-en-US,1]dd-mm-yy
which is being applied but it doesn't appear visually in excel until the cell has been double clicked. eg. they appear as 2023-07-10 but as soon as you double click the cell the formatting is applied and it shows 10-07-23
The dates from the SQL query are all selected using: DateFormat(exampl_date, '%Y-%m-%d') and they appear in the dataframe with data type Object
and then I use this function to create the excel workbook:
def format_excel(dataframe, worksheet):
## Creates new sheet
dataframe.to_excel(writer, sheet_name=worksheet, startrow=3, header=False, index=False)
workbook = writer.book
worksheet = writer.sheets[worksheet]
## Various cell formats to be used later
default_format = workbook.add_format({
'font_name': 'Arial',
'font_size': '10'})
heading_format = workbook.add_format({
'text_wrap': True,
'bold': True,
'valign': 'top',
'font_color': '#4682B4',
'font_name': 'Courier New',
'font_size': '12'})
date_format = workbook.add_format({
'num_format': '[$-en-US,1]dd-mm-yy'})
##Applies default format to all cells and sizes columns
worksheet.set_column('A:B', 23, default_format)
worksheet.set_column('C:C', 24, default_format)
worksheet.set_column('D:D', 28, default_format)
worksheet.set_column('E:M', 19, default_format)
##Sizes the first three rows
worksheet.set_row(0, 30)
worksheet.set_row(1, 1)
worksheet.set_row(2, 25)
## adds the heading row
run_day_formatted = run_day.strftime("%d/%m/%Y")
worksheet.merge_range('A1:F1', 'sometext ' + run_day_formatted + ' and ' + run_day_formatted, heading_format)
##Adds the date formatting
worksheet.set_column('F:M', 19, date_format)
## adds and formats the column names
header_format = workbook.add_format({
'bold': True,
'text_wrap': True,
'valign': 'top',
'fg_color': '#C0C0C0'})
for col_num, value in enumerate(dataframe.columns.values):
worksheet.write(2, col_num, value, header_format)
# Open XLSX Writer and create workbook
writer = pd.ExcelWriter(file_name(), engine='xlsxwriter')
# Uses above function to format the excel sheet
format_excel(df_results, 'some name')
# Closes writer and generates excel
writer.close()
print('Excel created')
I think the solution is adding a step that converts the date fields in the dataframe to a date datatype of some description but nothing seems to be working
Edit:
date_columns = [The Date Columns]
for column in date_columns:
df_results[column] = pd.to_datetime(df_results[column], format='%Y-%m-%d')
This causes the date value in excel to be 2023-07-05 00:00:00 instead of the intended 2023-07-05 and the custom formatting from xlsx writer is completely overwitten by yyyy-mm-dd hh:mm:ss instead of the intended [$-en-US,1]dd-mm-yy
Edit 2: This is what the excel is intended to look like: 1
Notice the value in the bar and the cell are different and the custom format string matches the one from xlsx writer.
I'm trying to get this file to reconcile exactly with an old one which is why i'm trying to recreate this instead of just putting the date i want in the field.