Wrong date associated with early morning times scraped from weather website (Python) + only January data?

Question

Every time I run the code below, the output file shows an incorrect date in the rightmost column: rows with times between 12:00 AM and 1:00 AM carry the previous day's date. Is there a way around this, such as a snippet I could insert into the code to prevent it? Thank you for your advice.

    import pandas as pd
    import datetime as dt

    startDt = dt.datetime(2012,1,1)
    endDt = dt.datetime.now()

    #columns for dataframes
    ListOfCol = ['TimeCET',
            'TemperatureC',
            'Dew PointC',
            'Humidity',
            'Sea Level PressurehPa',
            'VisibilityKm',
            'Wind Direction',
            'Wind SpeedKm/h',
            'Gust SpeedKm/h',
            'PrecipitationCm',
            'Events',
            'Conditions',
            'WindDirDegrees',
            'Day'
            ]

    for year in range(startDt.year,endDt.year+1):
        for month in range(startDt.month,13):
            if year < endDt.year: #means any remaining (future) days and months in the current year aren't included
                url = 'http://www.wunderground.com/history/airport/LZIB/{:d}/{:d}/1/DailyHistory.html?format=1'.format(year,month)
            elif month <= endDt.month: #means any remaining (future) days and months in the current year aren't included
                url = 'http://www.wunderground.com/history/airport/LZIB/{:d}/{:d}/1/DailyHistory.html?format=1'.format(year,month)
            else: #if current year and past current month leave as is
                break
            if year == startDt.year and month == startDt.month: #if first date for LZIB Airport create dataframe
                BlavaDataFrame = pd.read_csv(url,comment='<',skiprows=1)
                BlavaDataFrame.columns = ListOfCol
            else: #if NOT first date for LZIB Airport append to dataframe to make long list of all data organized by date
                BlavaDataFrameTEMP = pd.read_csv(url,comment='<',skiprows=1)
                BlavaDataFrameTEMP.columns = ListOfCol
                BlavaDataFrame = BlavaDataFrame.append(BlavaDataFrameTEMP,ignore_index=True)
    BlavaDataFrame.to_csv('./LZIBhrs.csv')
    print('Finished writing ./LZIBhrs.csv to disk')
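
Note that the URL template in the code fixes the day to 1, so each request covers only the first day of a month. A minimal sketch of building one URL per calendar day follows; the helper name `daily_urls` and the constant `URL_TEMPLATE` are hypothetical, and the sketch only builds the strings without downloading anything.

```python
import datetime as dt

# Same URL pattern as in the question, with the day turned into a parameter.
URL_TEMPLATE = ('http://www.wunderground.com/history/airport/LZIB/'
                '{:d}/{:d}/{:d}/DailyHistory.html?format=1')


def daily_urls(start, end):
    """Yield (date, url) for every calendar day from start to end, inclusive.

    Hypothetical helper, not part of the original code.
    """
    day = start
    while day <= end:
        yield day, URL_TEMPLATE.format(day.year, day.month, day.day)
        day += dt.timedelta(days=1)


# Short range that crosses a month boundary, to show the day rolling over.
urls = list(daily_urls(dt.date(2012, 1, 30), dt.date(2012, 2, 2)))
for day, url in urls:
    print(day, url)
```

Each URL could then be passed to `pd.read_csv` the same way the loop above does.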

The obvious first thing to check is which timezone the source uses and whether it differs from your local timezone. — jfs, Feb 25 '15 at 15:43
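
To illustrate the timezone check suggested in the comment, here is a minimal sketch using only the standard library (Python 3.9+). It assumes, without confirmation from the post, that the rightmost column holds UTC timestamps formatted like `2012-01-01 23:30:00` and that `Europe/Bratislava` is the right local zone for LZIB. The function name `local_day` and the sample values are made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumed local zone for the LZIB airport; adjust if the station differs.
LOCAL_TZ = ZoneInfo('Europe/Bratislava')


def local_day(utc_text):
    """Return the local calendar date for a timestamp string assumed to be UTC."""
    naive = datetime.strptime(utc_text, '%Y-%m-%d %H:%M:%S')
    aware = naive.replace(tzinfo=timezone.utc)   # mark the value as UTC
    return aware.astimezone(LOCAL_TZ).date()     # shift to local time, keep the date


# Made-up sample values, not data from the site.
samples = ['2012-01-01 23:30:00', '2012-01-02 00:30:00', '2012-07-01 22:30:00']
for text in samples:
    print(text, 'UTC ->', local_day(text))
```

If the assumption holds, a reading taken at 00:30 local time in winter is stamped 23:30 UTC on the previous day, which would match the symptom described in the question. Converting before extracting the date avoids the mismatch.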
