3

The background:

1, used Python to execute the SQL and saved the output into DataFrame format using Pandas.

2, appended the output to my existing excel file as one new worksheet.

Below is my code:

from pandas import ExcelWriter

sql_20 = ''''''

db = cx_Oracle.connect('*****', '*******', '**********')
conn = db.cursor()

conn.execute(sql_20)
df = pandas.read_sql_query(sql_20,db)
print(df)

with ExcelWriter('GUCS6J-Job Data.xlsx', mode='a') as writer:
    df.to_excel(writer, sheet_name=str(20))
writer.save()

Database: Oracle

Column A data type: date

The SQL result from Oracle:

enter image description here

The DataFrame print from Python:

enter image description here

So far so good


But what I got in the excel worksheet:

enter image description here

enter image description here

I am curious why I got the '##################' and the value of this cell is -693594 in the excel.

I hope I can get the same result as the DataFrame print in the excel file.

Gen Wan
  • 1,979
  • 2
  • 12
  • 19
  • whats the `dtype` for `df['A']` ? – Umar.H Oct 30 '19 at 22:11
  • 1
    Hi @Datanovice, it is . I used print(type(df['A'])) – Gen Wan Oct 30 '19 at 22:13
  • `pd.to_datetime('01-Jan-01')` returns `2001-01-01` cast your dates into a valid pandas date time `df['date'] = pd.to_datetime(df['date'])` tested on my machine and i can view in excel. – Umar.H Oct 30 '19 at 22:14
  • I'm just curious as to why you need ```from pandas import ExcelWriter```? You would have already imported pandas earlier(since you are creating a dataframe earlier) and you should be able to use ```df.to_excel``` without any additional imports. – NotAName Oct 30 '19 at 22:46
  • Hi @Datanovice, in our oracle database, '01-Jan-01' is 0001-01-01. Pandas reads the output correctly because the data frame shows '0001-01-01 00:00:00'. I will have hundreds of columns in the output and I do not want to cast every column manually. – Gen Wan Oct 30 '19 at 22:46
  • Hi @user2825403, because I want to [append] the output to the existing excel file. Do you know if there is any other approach? – Gen Wan Oct 30 '19 at 22:48
  • Since you simply want to add a new worksheet to the existing file I'd just do ```df.to_excel('GUCS6J-Job Data.xlsx', sheet_name=str(20))``` – NotAName Oct 30 '19 at 22:56
  • what datetime format is '0001-01-01' ? Year 0 is when the Georgian calendar started unless i'm missing something. You could `melt/unpivot` the data frame then cast your datetimes into a datetime then save to excel. You could tackle in many ways, depends on the shape of your data, is it just timesheet data with employee id and dates? – Umar.H Oct 30 '19 at 22:56
  • hi @user2825403, I just tried this. This approach will re-create the file and it only has one worksheet which is [20]. So I lost my existing tab... – Gen Wan Oct 30 '19 at 23:01
  • @Datanovice, I cannot cast every column manually because we will have hundreds of columns and hundreds of new tabs in the excel file. So the question here is: the second record [2014-09-20 00:00:00] works great in the excel file. Why the [0000-01-01 00:00:00] could not be loaded correctly? – Gen Wan Oct 30 '19 at 23:07

1 Answers1

0

please refer to these two links to see if it solves your problem https://xlsxwriter.readthedocs.io/example_pandas_datetime.html Python pandas dataframe "Date" Index different format in xlsx and csv

I thinks you need to add the date_format in you ExcelWriter()

Kevin
  • 1
  • Welcome to StackOverlfow! Please edit your answer to include an example of your linked pages' tips, as they apply to OP's code. Maybe rewrite their code, adding in your suggested changes? This will make your answer more useful to others who might have this issue later on. – Das_Geek Oct 31 '19 at 01:01
  • I think I found the causes https://stackoverflow.com/questions/32888124/pandas-out-of-bounds-nanosecond-timestamp-after-offset-rollforward-plus-adding-a – Kevin Oct 31 '19 at 18:29