0
import pandas as pd
import sys
df = pd.read_csv(sys.stdin, sep='\t', parse_dates=['Date'], index_col=0)
df.to_csv(sys.stdout, sep='\t')
Date    Open
2020/06/15  182.809924
2021/06/14  257.899994

I got the following output with the input shown above.

Date    Open
2020-06-15  182.809924
2021-06-14  257.899994

The date format is changed. Is there a way to maintain the date format automatically? (For example, if the input is in YYYY/MM/DD format, the output should be in YYYY-MM-DD. If the input is in YYYY-MM-DD, the output should in YYYY-MM-DD, etc.)

I prefer a way that I don't have to manually test the data format. It is best if there is an automatical way to maintain the date format, no matter what the particular date format is.

user1424739
  • 11,937
  • 17
  • 63
  • 152
  • 2
    `2020/06/15` is of dtype `string` but `2020-06-15` is of dtype `datetime[ns]` since you are using parse_date parameter so `Date` column is converted to datetime you can change the format to string anytime by using `strftime()` method `df['Date'].dt.strftime('%Y/%m/%d')` – Anurag Dabas Jun 19 '21 at 05:30
  • 1
    You are confusing two things here. The string format in your file and the representation pandas uses to print the date after it has been parsed. One has nothing to do with the other – MaxNoe Jun 19 '21 at 05:32
  • https://xkcd.com/1179/ – MaxNoe Jun 19 '21 at 05:33
  • Why YYYY-MM-DD defined by the ISO is better than other format? It seems that if the ISO defined YYYY/MM/DD, it would work equally well. – user1424739 Jun 19 '21 at 05:41
  • The forum answers above covers your question. There is a difference between what the datetime formatting is for machine and human. If I where you, I would not care the style but only when I have to share date to humans, e.g. `to_csv` or using `.style` for display – Prayson W. Daniel Jun 19 '21 at 05:45
  • Please remove the false assumptions from your question. **Nothing at all is broken.** You don't have to "manually test the data format". The date "format" isn't changed, only the separator (from '/' to '-'), and that only for internal display, and anyway you can configure that when you export it with `.to_csv(..., date_format)`. There is no issue here. – smci Jun 19 '21 at 05:53

2 Answers2

3

You can specify the date_format argument in to_csv:

df.to_csv(sys.stdout, sep='\t', date_format="%Y/%m/%d")
Andrew Eckart
  • 1,618
  • 9
  • 15
  • I don't want to manually specify a specific dateformat in my code as the input can be any dateformat. – user1424739 Jun 19 '21 at 05:32
  • Then your best bet is to not parse the dates in the first place, as that will preserve the string format. You can create a second column with `df["parsed_date"] = pd.to_datetime(df["Date"])`, use that for calculations, and then exclude it in `to_csv` by passing `columns=[col for col in df.columns if col != "parsed_date"]`. – Andrew Eckart Jun 19 '21 at 05:35
0

Keep the dates as strings and parse them into an extra column if you need to operate on them as dates?

df = pd.read_csv(sys.stdin, sep='\t', index_col=0)
df['DateParsed'] = pd.to_datetime(df["Date"])
jnnnnn
  • 3,889
  • 32
  • 37