How do I timestamp an excel document using Pandas?

Question

I have a script that uses FFMPEG and CMD to cut video files based off of an excel document row by row. I would like python to add a timestamp after it is done with a row. Can you guys please help?

import subprocess as sp, pandas as pd

ffmpeg = 'C:/FFMPEG/bin/ffmpeg.exe' # on Windows
datafile = r'C:\Users\A_Do\Dropbox\1. Projects\2. Python\TM Creator\tm_creator_test1.xlsx'

xl = pd.ExcelFile(datafile,index = False)
df = xl.parse('Sheet1')

def create_tm():
    row_iterator = df.iterrows()
     # take first item from row_iterator
    for i, row in row_iterator:
        infile = row['filename']
        outputfile = row['outputfilename']
        timein = row['timein']
        duration = row['duration']
        decision = row['Create TM?']
        if decision == "Y":
            sp.call(ffmpeg + " -y -i " + infile + " -map 0:0 -map 0:1 -map 0:2 -acodec copy -ss " + str(timein) + " -codec copy -t " + str(duration) + " " + outputfile,shell=True) #this works

        elif decision != decision: #this gets rid of the NaN
            break
        else:
            print "You said you didn't want to make a TM for " + str(infile)

create_tm()

Thanks!

My final code:

import subprocess as sp, pandas as pd
# (1) new import
from openpyxl import load_workbook
# (2) new import
from datetime import datetime

ffmpeg = 'D:/FFMPEG/bin/ffmpeg.exe' # on Windows
datafile = r'D:\Dropbox\1. Projects\2. Python\TM Creator\tm_creator_test1.xlsx'

# (3) open the file in openpyxl first:
book = load_workbook(datafile)

xl = pd.ExcelFile(datafile,index = False)
df = xl.parse('Sheet1')

def create_tm():
    row_iterator = df.iterrows()
     # take first item from row_iterator
    for i, row in row_iterator:
        infile = row['filename']
        outputfile = row['outputfilename']
        timein = row['timein']
        duration = row['duration']
        decision = row['Create TM?']
        if decision == "Y":
            sp.call(ffmpeg + " -y -i " + infile + " -map 0:0 -map 0:1 -acodec copy -ss " + str(timein) + " -codec copy -t " + str(duration) + " " + outputfile,shell=True) #this works
            # (4) Wherever in the code you want to put the timestamp:
            df.loc[i, 'Timestamp'] = str(datetime.now())
            # (5) This saves the sheet back into the original file, without removing
            # any of the old sheets.
            writer = pd.ExcelWriter(datafile)
            writer.book = book
            writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
            df.to_excel(writer, index=False)
            writer.save()
        elif decision != decision: #this gets rid of the NaN
            break
        else:
            print "You said you didn't want to make a TM for " + str(infile)

brenns10 · Accepted Answer · 2015-07-16T04:02:52.740

0

Applying the answer from this SO question, here is an updated version of your code, with comments where I made changes:

import subprocess as sp, pandas as pd
# (1) new import
from openpyxl import load_workbook
# (2) new import
from datetime import datetime

ffmpeg = 'C:/FFMPEG/bin/ffmpeg.exe' # on Windows
datafile = r'C:\Users\A_Do\Dropbox\1. Projects\2. Python\TM Creator\tm_creator_test1.xlsx'

# (3) open the file in openpyxl first:
book = load_workbook(datafile)

xl = pd.ExcelFile(datafile,index = False)
df = xl.parse('Sheet1')

def create_tm():
    row_iterator = df.iterrows()
     # take first item from row_iterator
    for i, row in row_iterator:
        infile = row['filename']
        outputfile = row['outputfilename']
        timein = row['timein']
        duration = row['duration']
        decision = row['Create TM?']
        if decision == "Y":
            sp.call(ffmpeg + " -y -i " + infile + " -map 0:0 -map 0:1 -map 0:2 -acodec copy -ss " + str(timein) + " -codec copy -t " + str(duration) + " " + outputfile,shell=True) #this works

        elif decision != decision: #this gets rid of the NaN
            break
        else:
            print "You said you didn't want to make a TM for " + str(infile)

        # (4) Wherever in the code you want to put the timestamp:
        df.loc[i, 'Timestamp'] = str(datetime.now())

    # (5) This saves the sheet back into the original file, without removing
    # any of the old sheets.
    writer = pd.ExcelWriter(datafile)
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    df.to_excel(writer)
    writer.save()

create_tm()

If you'd like more control over the format of your timestamp, consult the docs for datetime.datetime.strftime(). Also, I wrote this assuming that your Excel file had other sheets that you would be upset if you lost. If that is not the case, you don't have to do nearly as much. You can ignore changes numbered 1 and 3, and replace 5 with df.to_excel(datafile).

edited Jul 16 '15 at 04:02

answered Jul 16 '15 at 00:10

brenns10

3,109
3
22
24

Thank you for the reply! You were correct that I had other sheets. I am getting a TypeError: copy() got an unexpected keyword argument 'font' [Finished in 0.4s with exit code 1] Can you help me get rid of it? – Andy Do Jul 16 '15 at 01:39
I had the same error when testing the code. There seems to be a bug in Pandas with OpenPyXL version 2.2. If you install OpenPyXL version 2.1 instead, it works fine. `pip install openpyxl==2.1` worked for me, hopefully it'll work on Windows too! – brenns10 Jul 16 '15 at 01:47
Here's the link to the [GitHub Issue for Pandas about it](https://github.com/pydata/pandas/issues/10125) – brenns10 Jul 16 '15 at 01:48
Apparently, openpyxl hasn't had a very steady API since they released 2.0, so Pandas can't really keep up with it. So maybe you'll want to go back to 2.0 (but I had no problems with 2.1). – brenns10 Jul 16 '15 at 01:49
Thanks! This solved the issue! I am a complete newb! where would you put your line to write the code? I try to put it on line 29 but it causes the sheets to get messed up because it adds another index column. – Andy Do Jul 16 '15 at 01:58
I'm not sure what you mean, "the line to write the code". If you mean, where do you put the line `df.loc[i, 'Timestamp'] = str(datetime.now())`, it depends on whether you want to have a timestamp only after you've called ffmpeg. If that's so, I would put it after the `sp.call(...)` line, inside the if statement. Is that what you're asking? – brenns10 Jul 16 '15 at 02:02
Yes I put it after sp. I believe what is happening is that it is copying the index from the dataframe and shifting my columns in the excelfile to the right. – Andy Do Jul 16 '15 at 02:10
Oh! You can probably add `index=False` to the call to `df.to_excel()`. According to [this](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_excel.html) documentation page. – brenns10 Jul 16 '15 at 02:14
Thanks so much! that got rid of the index. However, the timestamp won't show at all. I'm trying to get it in the Timestamp column at the appropriate row. – Andy Do Jul 16 '15 at 02:20
Did you want a timestamp only for the times that you call ffmpeg? Or for every row of the spreadsheet? Cause you might have to move the line labeled `(4)`. Do you see a "Timestamp" column? – brenns10 Jul 16 '15 at 02:25
II want the timestamp everytime FFMPEG runs. I moved line (4) under the if statement after the sp call. I have a designated Timestamp column. – Andy Do Jul 16 '15 at 02:27
Huh, I really don't know why this would be happening. Maybe open a new question for that issue, this thread is getting pretty long! – brenns10 Jul 16 '15 at 02:29
I figured it out. I need to put the writer portion inside the loop! DOH! thank you for your help!!! – Andy Do Jul 16 '15 at 03:54
Oh damn, I didn't even put that portion within the function!! That explains a whole lot! It doesn't need to be within the loop, but it does need to be within the function. I'll edit my answer. – brenns10 Jul 16 '15 at 04:02

How do I timestamp an excel document using Pandas?

1 Answers1