1

I use SqlAlchemy to query an Oracle database and store results in csv files.

I would like to specify a global format for dates to be written like this :

'DD-MM-YYYY HH24:MI:SS'.

I have set NLS_DATE_FORMAT this way on the system.

For exemple :

datetime.datetime(2016, 12, 22, 13, 12, 35)

Would end up :

2004-12-22 13:12:35

I would like :

22-12-2004 13:12:35

As I process hundreds of tables, I cannot apply 'strftime' "manually".

stockersky
  • 1,531
  • 2
  • 20
  • 36
  • Can you provide a sample of the code you are sending to the Oracle database via SqlAlchemy? Is it an option for you to handle the conversion as part of your Oracle query, instead of within your Python code? A hint for how to do this can be found here: https://stackoverflow.com/questions/26731319/datetime-on-where-clause-oracle?rq=1 – Projski Sep 26 '16 at 18:19
  • I get the sql statement from another module. Then execute it (result_proxy = connection.execute(request)) . There is no formatting in the sql statement. It just selects fields. I tried to execute : alter session set NLS_DATE_FORMAT='DD-MON-YYYY HH24:MI:SS' after connecting to the database but it does not change anything. – stockersky Sep 27 '16 at 09:28
  • As I have a huge amount of tables, I cannot make custom code for each of them. It more like a select * from every tables in an oracle schema and dump each table un a distinct csv. – stockersky Sep 27 '16 at 09:30

2 Answers2

0

I found a way to solve this issue.

Yes, converting date to char with the appropriate formatting would work. But, in my case, SQL statement are provided by another module and I need to process more than a hundred tables.

So, I decided to work with datas contained in the ResultProxy object returned by SqlAlchemy by the execute() method.

I fetch a table by chunks of 1000 rows (chunk is classic type list) at a time. But those rows are kind of tuples (more precisely an SqlAlchemy RowProxy object) and it cannot be modified.

So, I had a treatment to cast them in ordered dictionnaries and update the chunk list.

It's important to use 'collections.OrderedDict' because it keeps fields order. With a classical dic, then field labels and values might not match.

Now, my chunk is ready for all kind of treatments (change Dates to strings with appropriate formatting, substitute char in VARCHAR strings, etc and so on...). Dictionary structure is perfect for this.

Note, that before writing, OrderedDic rows in the chunk list have to be cast back.

Here is a simplified example :

result_proxy = connection.execute(request)
while True:
    chunk = self.result_proxy.fetchmany(1000)
    if not chunk:
        break
    # treatments comes here after :

    # 1- transform into a dic in order to be able to modify
    for i, row in enumerate(chunk):
        chunk[i] = OrderedDict(row)

    # 2- clean dates
    for i, row_odic in enumerate(chunk):
        for item in row_odic:
            if(type(row_odic[item]) is datetime.datetime):
                row_odic[item] = str(row_odic[item].strftime("%d/%m/%Y"))
        chunk[i] = row_odic

    # Other data treatment

    # cast back for it to look like a classical result :
    for c, row_odic in enumerate(chunk):
        self.chunk[c] = row_odic.values()

    # finally write row_odic.values in the csv file

I am not sure if it's the most efficient solution but performances look good. I have a version of this treatment (same volume of datas), but using the Pandas library, that is a bit longer to execute.

stockersky
  • 1,531
  • 2
  • 20
  • 36
0

maybe I have the similar purpose as you.

I just want print(row) has formatted datetime,
rather than datatime.datetime(y,m,d.. things.

I have tried several ways (about handle datetime):

  • sql level convert use func date_format // sql dialects related
    SELECT *, date_format(last_login, '%m-%d-%Y') FROM devices;
    // func.date_format, func.to_char
  • convert RowProxy to normal dict //different access style
  • a custom JSON.encoder //only for json stringify
    https://stackoverflow.com/a/22238613/4896468
  • inherit RowProxy //less invasive I think

just explain the inherit way here:

① the result

orig row

>>> type(row)
<class 'aiomysql.sa.result.RowProxy'>

>>> row
(4982, '203233D81914', False, datetime.datetime(2021, 6, 28, 19, 30, 6))

>>> row.last_login
datetime.datetime(2021, 6, 28, 19, 30, 6)

my row

>>> my_row = my_proxy(row)
>>> my_row
(4982, '203233D81914', False, '2021-06-28 19:30:06')

>>> my_row.last_login
'2021-06-28 19:30:06'

② the code

inherit RowProxy, and override (any of them on demand)

  • __repr__ (for print)
  • __getattr__ (for .xxx fileds access)
from datetime import datetime

class my_proxy(proxy_type):
  # refer orig RowProxy implement
  def __init__(self, row):
    self._result_proxy = row._result_proxy
    self._row = row._row
    self._processors = row._processors
    self._keymap = row._keymap
  # for print
  def __repr__(self): 
    # return ' xx:'+ repr(self.as_tuple())
    return repr(tuple(v.strftime('%F %T') if type(v)==datetime else v for v in self.as_tuple()))
  # for field access
  def __getattr__(self, name):
    try:
      # return self[name]
      v = self[name]
      return v.strftime('%F %T') if type(v)==datetime else v
    except KeyError as e:
      raise AttributeError(e.args[0])

proxy_type = type(row)

which is aiomysql.sa.result.RowProxy

code based on

  • aiomysql 0.0.22
  • SQLAlchemy 1.4.31

especially the __init__ function:
 github:aiomysql/sa/result.py#L22

different version may differ.

yurenchen
  • 1,897
  • 19
  • 17