Format dates in Excel for comparison

Question

I'm in the midst of writing a iPython notebook that will pull the contents of a .csv file and paste them into a specified tab on an .xlsx file. The tab on the .xlsx is filled with a bunch of pre-programmed formulas so that I might run an analysis on the original content of the .csv file.

I've ran into a snag, however, with the the date fields that I copy over from the .csv into the .xlsx file.

The dates do not get properly processed by the Excel formulas unless I double-click the date cells or apply Excel's "text to columns" function on the column of dates and set a tab as the delimiter (which I should note, does not split the cell).

I'm wondering if there's a way to either...

write a helper function that logs the keystrokes of applying the "text to columns" function call

write a helper function to double click and return down each row of the column of dates

from openpyxl import load_workbook
import pandas as pd

def transfer_hours(report_name, ER_hours_analysis_wb):

    df = pd.read_csv(report_name, index_col=0)

    book = load_workbook(ER_hours_analysis_wb)
    sheet_name = "ER Work Log"

    with pd.ExcelWriter("ER Hours Analysis 248112.xlsx", 
        engine='openpyxl')  as writer:

        writer.book = book
        writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

        df.to_excel(writer, sheet_name=sheet_name, 
            startrow=1, startcol=0, engine='openpyxl')

you may also want to consider openpyxl that is built to interact with excel specifically https://openpyxl.readthedocs.io/en/stable/index.html — brddawg, Oct 31 '18 at 22:10
writing this directly into excel will likely be easier. working on an example — brddawg, Oct 31 '18 at 22:24
if i'm understanding this correctly you're combining the csv with an excel file? — brddawg, Oct 31 '18 at 22:50
Right, I'm essentially copying over the contents of .csv to a specific tab of a .xlsx file — Coleman, Oct 31 '18 at 23:57

brddawg · Answer 1 · 2018-11-01T18:04:45.283

As great a tool as pandas is designed to be, in this case there may not be a reason to include.

Here is a shorter structure for what you're trying to accomplish:

import csv
import datetime
from openpyxl import load_workbook

def transfer_hours(report_name, ER_hours_analysis_wb):
    wb = load_workbook(ER_hours_analysis_wb)
    ws = wb['ER Work Log'] 

    csvfile = open(report_name, 'rt')
    reader = csv.reader(csvfile,delimiter=',')

    #iterators
    rownum = 0
    colnum = 0

    for row in reader:       
        for col in row:
            dttm = datetime.datetime.strptime(col, "%m/%d/%Y")
            ws.cell(column=colnum,row=rownum).value = dttm

    wb.save('new_spreadsheet.xlsx')

What you'll be able to do from here is break out which columns should have what format based on the position in the csv. Here is an example:

    for row in reader:       
        ws.cell(column=0,row=rownum,value=row[0])
        dttm = datetime.datetime.strptime(row[1], "%m/%d/%Y")
        ws.cell(column=1,row=rownum).value = dttm

For reference:

https://openpyxl.readthedocs.io/en/stable/usage.html

In Python, how do I read a file line-by-line into a list?

How to format columns with headers using OpenPyXL

I'm getting a error message that Style is undefined 'NameError: name 'Style' is not defined' — Coleman, Nov 01 '18 at 15:14
Getting an AttributeError: 'Workbook' object has no attribute 'save' — Coleman, Nov 01 '18 at 17:19

score 0 · Answer 2 · answered Nov 01 '18 at 07:47

0

Use the xlsx module

import xlsx
load_workbook  ( filen = (filePath,  read_only=False, data_only=False )

Setting data_only to False will return the formulas whereas data_only=True returns the non-formula values.

answered Nov 01 '18 at 07:47

RasikhJ

601
5
10

Format dates in Excel for comparison

2 Answers2