I'm going nuts trying to update a column of time entries in a dataframe. I am opening a CSV file that has a column of time entries in UTC. I can take these times, convert them to Alaska Standard Time, and print the new time out just fine. But when I attempt to put the time back into the dataframe, I get no errors, yet the new time does not appear in the dataframe; the old UTC time is retained. The code is below. What am I missing? Is there something special about times?

import glob
import os
import pandas as pd
from datetime import datetime
from statistics import mean

def main():
    AKST = 9
    allDirectories = os.listdir('c:\\MyDir\\')
    for directory in allDirectories:
        curDirectory = directory.capitalize()
        print('Gathering data from: ' + curDirectory)
        dirPath = 'c:\\MyDir\\' + directory + '\\*.csv'
        # Files are named by date, so sorting by name gives us a proper date order
        files = sorted(glob.glob(dirPath))
        df = pd.DataFrame()
        for i in range(0,len(files)):
            data = pd.read_csv(files[i], usecols=['UTCDateTime', 'current_humidity', 'pm2_5_cf_1', 'pm2_5_cf_1_b'])
            dfTemp = pd.DataFrame(data) # Temporary dataframe to hold our new info
            df = pd.concat([df, dfTemp], axis=0) # Add new info to end of dataFrame
        print("Converting UTC to AKST, this may take a moment.")
        for index, row in df.iterrows():
            convertedDT = datetime.strptime(row['UTCDateTime'], '%Y/%m/%dT%H:%M:%Sz') - pd.DateOffset(hours=AKST)
            print("UTC: " + row['UTCDateTime'])
            df.at[index,'UTCDateTime'] = convertedDT
            print("AKST: " + str(convertedDT))
            print("row['UTCDateTime] = " + row['UTCDateTime'] + '\n') # Should be updated with AKST, but is not!

Edit - Alternatively: is there a way to convert the date when it is first read into the dataframe? That seems like it would be faster than having two for loops.

Roger Asbury
    Please provide a [minimal reproducible example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) :) –  Mar 19 '22 at 01:21

1 Answer

From your code, it looks like the data is being updated correctly in the dataframe. The problem is that you are printing `row`, which was fetched from the dataframe before the update and therefore still holds the old value.

 # This line updates df
 df.at[index, 'UTCDateTime'] = convertedDT
 # This line prints row, which was fetched before the update
 print("row['UTCDateTime'] = " + row['UTCDateTime'])
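To see the new value inside the loop, read it back from the dataframe rather than from `row`. The following is a minimal sketch, not the asker's full script: the two sample rows are made up, the timestamp layout is assumed from the format string in the question, and the converted time is stored as a formatted string so the column keeps a single type.

```python
from datetime import datetime, timedelta

import pandas as pd

# Hypothetical sample data standing in for the CSV files in the question
df = pd.DataFrame({
    'UTCDateTime': ['2022/03/18T10:00:00z', '2022/03/18T11:00:00z'],
    'current_humidity': [40.0, 41.0],
})

AKST = 9  # fixed offset in hours, as in the question

for index, row in df.iterrows():
    converted = datetime.strptime(row['UTCDateTime'], '%Y/%m/%dT%H:%M:%Sz') - timedelta(hours=AKST)
    df.at[index, 'UTCDateTime'] = converted.strftime('%Y/%m/%d %H:%M:%S')
    # row was built before the assignment, so it still shows the UTC string
    print('row still holds:', row['UTCDateTime'])
    # df.at reads the current value from the dataframe itself
    print('df now holds:   ', df.at[index, 'UTCDateTime'])
```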

See sample code below and its output for the explanation.

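import pandas as pd

data = pd.DataFrame({'Year': [1982, 1983], 'Statut': ['Yes', 'No']})
for index, row in data.iterrows():
    # Write an integer so the Year column keeps its numeric type
    data.at[index, 'Year'] = 50000 + index
    # row was fetched before the update, so it still shows the old value
    print('Printing row which is unchanged : ', row['Year'])
print('Updated Dataframe\n', data)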

Output

Printing row which is unchanged :  1982
Printing row which is unchanged :  1983
Updated Dataframe
     Year Statut
0  50000    Yes
1  50001     No
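On the question's edit about avoiding the second loop: the whole column can probably be converted in one step. This is only a sketch under assumptions: the sample rows are made up, the timestamps are assumed to match the format string used in the question, the result goes into a new column named `AKSTDateTime` (a name chosen here for illustration), and a fixed 9-hour offset is applied, which ignores daylight saving time.

```python
import pandas as pd

# Hypothetical sample data standing in for the concatenated CSV files
df = pd.DataFrame({
    'UTCDateTime': ['2022/03/18T10:00:00z', '2022/03/18T23:30:00z'],
    'current_humidity': [40.0, 41.0],
})

AKST_OFFSET = pd.Timedelta(hours=9)  # fixed offset, no daylight saving handling

# Parse the whole column at once, then shift every value by the offset
utc = pd.to_datetime(df['UTCDateTime'], format='%Y/%m/%dT%H:%M:%Sz')
df['AKSTDateTime'] = utc - AKST_OFFSET

print(df)
```

Writing to a new column keeps the original UTC strings available for checking. If daylight saving time matters, the `tz_localize` and `tz_convert` methods on the `.dt` accessor may be a better fit than a fixed offset.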
Manjunath K Mayya