0

pd.DataFrame has Date column (has datetime data type) with style such as: %dd.%mm.%yyyy #for example 03.08.2020 but after INSERT to SQL Server df.to_sql(con_param) result in RDB looks like 08.03.2020 instead 03.08.2020. Problem is not linear, cause for example 31.07.2020 looks like 31.07.2020. Local of datebase is SQL_Latin1_General_CP1_CI_AS. What is wrong? UPD to sql block fyi

correct_df = pd.read_csv(path)
correct_df = correct_df.astype({
                           'Req_Date': 'datetime64', 'Date': 'datetime64'})
                           
 correct_df.to_sql(con=conn, name='table', schema='schema', if_exists='append', index=False)

khelwood
  • 55,782
  • 14
  • 81
  • 108
Tyomik_mnemonic
  • 786
  • 3
  • 9
  • 31
  • 1
    Dates have no style or format, they are binary values. Formats apply **only** if dates get converted to strings or vice versa. Your code is converting dates to strings somewhere - or there's no problem at all and your client tool displays dates in your locale - German? Russian? – Panagiotis Kanavos Sep 30 '20 at 14:22
  • None of what you actually state here actually makes a lot of sense. Date and time data types (in SQL Server) don't have *any* formats; they are stored as binary values. If you are inserting a date and time value as a string (that isn't unambiguous) and then getting a different value and/or format in the **presentation layer** that has nothing to do SQL Server. The problem is the ambiguous format used to insert the data, and the format choice in the presentation layer. – Thom A Sep 30 '20 at 14:23
  • What tool did you use to check the dates? Why do you assume there's a problem to begin with? Have you tried using SSMS on Windows or Azure Data Studio, or any other database management tool to query the data? – Panagiotis Kanavos Sep 30 '20 at 14:24
  • When inserting date(time)s, ideally use a date and time data type from the application layer to the SQL. If not, then use an unambiguous string format; there are only 2 in SQL Server regardless of data type and language: `yyyyMMdd` and `yyyy-MM-ddThh:mm:ss.nnnnnnn`. As for the format displayed when you retrieve the data, that is entirely up to the presentation layer, so you would need to look at the application settings/code. – Thom A Sep 30 '20 at 14:24
  • Python and Pandas have date types, and again, those have no format or style. Are you sure Pandas contains dates? Perhaps it contains strings instead? – Panagiotis Kanavos Sep 30 '20 at 14:26
  • Thanks for info. @PanagiotisKanavos Yes, I sure, that all cells in pd.DateFrime['Date'] have datetime64 data type. I check it and I cast it again before to_sql – Tyomik_mnemonic Sep 30 '20 at 14:51
  • You shouldn't have to cast anything. `df.to_sql` should be able to write dates as dates, not strings, unless there's a bug. Please post code that reproduces the problem – Panagiotis Kanavos Sep 30 '20 at 15:00
  • @PanagiotisKanavos Yes. I set in on ```pd.DataFrame``` level. So if cell can't to datetime, console throws error. – Tyomik_mnemonic Sep 30 '20 at 15:17
  • I think problem can be in collation diffrience between server and client – Tyomik_mnemonic Sep 30 '20 at 15:19
  • Dates aren't strings, they have no collation or codepage. They are binary values. You still haven't posted something that demonstrates the problem. How are you sure the incorrect conversion wasn't caused by `astype` parsing the file using a different date format than the one used in the CSV file? – Panagiotis Kanavos Sep 30 '20 at 15:37
  • 1
    You should probably use [pd.to_datetime](https://stackoverflow.com/questions/17134716/convert-dataframe-column-type-from-string-to-datetime-dd-mm-yyyy-format) with a specific format when parsing the CSV's date fields – Panagiotis Kanavos Sep 30 '20 at 15:40
  • Does this answer your question? [Convert DataFrame column type from string to datetime, dd/mm/yyyy format](https://stackoverflow.com/questions/17134716/convert-dataframe-column-type-from-string-to-datetime-dd-mm-yyyy-format) – Panagiotis Kanavos Sep 30 '20 at 15:41
  • @PanagiotisKanavos no. Cause my DataFrame['Date'] cells are datetime64. I check it by pd.to_datetime() too. – Tyomik_mnemonic Sep 30 '20 at 18:37
  • 1
    Are you sure? Are you 100% sure your user locale matches the format of the date strings in the CSV file? Have you tried `to_datetime` with an explicit format? You haven't posted any sample data yet, so a format mismatch is the most likely explanation. What do the dates in the CSV file look like? – Panagiotis Kanavos Sep 30 '20 at 18:41
  • 1
    `Cause my DataFrame['Date'] cells are datetime64` that doesn't mean much if the *wrong* format was used for parsing – Panagiotis Kanavos Sep 30 '20 at 18:42
  • ok. U say, that problem with locale matches between csv and pandas datetime, yes? visually it is looks eqally (03.08.2020 in csv and 03.08.2020 in pandas). So how I can check my user locale format of the date strings in the CSV file? – Tyomik_mnemonic Sep 30 '20 at 18:51

1 Answers1

1

So, thanks to Panagiotis Kanavos messages. If you face with a problem like that, you have to set format of the date string of csv in your DateFrime. For example

df['Date'] = pd.to_datetime(df['Date'], format='%d.%m.%Y') #format=f'{your_date_style}'

It can works. It works for me.

Tyomik_mnemonic
  • 786
  • 3
  • 9
  • 31