2

After I read a file and parse the dates the date format changes within the dataframe. Find below my code.

v2x = r'E:\Model\Data\v2x.csv'
outfile = r'E:\Model\ModelSpecific\Input_shat2.txt'

df_data = pd.read_csv(file_name,parse_dates=[0], index_col=0)
df_v2x = pd.read_csv(v2x, parse_dates=[0], sep=",")

print(df_v2x[4800:5000])

The correct format should be '%y-%m-%d'

And the print output:

4988 2018-07-08   V2TX     12.6265 --> Wrong Format
4989 2018-08-08   V2TX     12.8654
4990 2018-09-08   V2TX     12.4882
4991 2018-10-08   V2TX     15.1113 
4992 2018-08-13   V2TX     15.9406 --> Right Format
4993 2018-08-14   V2TX     15.8610
4994 2018-08-15   V2TX     18.4755
4995 2018-08-16   V2TX     16.2633

Thanks for your help!

Maeaex1
  • 703
  • 7
  • 36
  • Hi @roganjosh, after reading the file the correct format should be %y-%m-%d (I guess this is the default format for pandas), the input file has the date format dd/mm/yyyy. Somehow it doesn't read the dates correctly, so the date structure changes within dataframe which messes up my followed calculations ... – Maeaex1 Sep 22 '18 at 16:20
  • Does this help? https://stackoverflow.com/questions/17465045/can-pandas-automatically-recognize-dates – Bharath M Shetty Sep 22 '18 at 17:14
  • @Dark, thanks for the hint. I implemented the suggested dateparser `dateparse = lambda x: pd.datetime.strptime(x, '%d/%m/%d')` But it returns: ValueError: unconverted data remains: 99 – Maeaex1 Sep 22 '18 at 17:26
  • How about not parsing via read_csv? Just read the file and convert that column to datetime using `to_datetime` with `errors = 'coerce'`. – Bharath M Shetty Sep 22 '18 at 17:30
  • @Dark tried it that way but same outcome like before... keep changing the format. If you wanna check the data then you can find it here [vstoxx](https://www.stoxx.com/document/Indices/Current/HistoricalData/h_v2tx.txt) Thanks for your help! – Maeaex1 Sep 22 '18 at 17:50

1 Answers1

0

Found a solution with "annoying" workaround (extracting strings of day, month and year).

v2x = r'E:\Model\Data\v2x.csv'
outfile = r'E:\Model\ModelSpecific\Input_shat2.txt'

data = pd.read_csv(v2x, sep=",")

data['Year'] = data['Date'].str.slice(6, 10)  #redo the index because of american timestamp
data['Month'] = data['Date'].str.slice(3,5) 
data['Day'] = data['Date'].str.slice(0,2)
datetime = pd.to_datetime(data[['Year','Month','Day']])
data = data.drop(['Date','Year','Month','Day'],axis=1)
data = pd.concat((datetime,data),axis=1)
data = data.rename({0:'Date'},axis=1)
data = data.set_index('Date')
Maeaex1
  • 703
  • 7
  • 36