109

How do I serialize pyodbc cursor output (from .fetchone, .fetchmany or .fetchall) as a Python dictionary?

I'm using bottlepy and need to return dict so it can return it as JSON.

Foo Stack
  • And yes, I did notice this was in the [FAQ for PEP 249](http://www.python.org/dev/peps/pep-0249/#frequently-asked-questions), however that doesn't change my requirement. – Foo Stack May 13 '13 at 10:09

10 Answers

216

If you don't know columns ahead of time, use Cursor.description to build a list of column names and zip with each row to produce a list of dictionaries. Example assumes connection and query are built:

>>> cursor = connection.cursor().execute(sql)
>>> columns = [column[0] for column in cursor.description]
>>> print(columns)
['name', 'create_date']
>>> results = []
>>> for row in cursor.fetchall():
...     results.append(dict(zip(columns, row)))
...
>>> print(results)
[{'create_date': datetime.datetime(2003, 4, 8, 9, 13, 36, 390000), 'name': u'master'},   
 {'create_date': datetime.datetime(2013, 1, 30, 12, 31, 40, 340000), 'name': u'tempdb'},
 {'create_date': datetime.datetime(2003, 4, 8, 9, 13, 36, 390000), 'name': u'model'},     
 {'create_date': datetime.datetime(2010, 4, 2, 17, 35, 8, 970000), 'name': u'msdb'}]
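The core pattern can be checked without a live database; here is a minimal, self-contained sketch where `description` and `rows` stand in for `cursor.description` and `cursor.fetchall()` (the names and values are made up):

```python
# Stand-ins for cursor.description and cursor.fetchall() (hypothetical data).
description = [('name', str), ('create_date', str)]
rows = [('master', '2003-04-08'), ('tempdb', '2013-01-30')]

# Same recipe as above: column names from the description, zipped with each row.
columns = [column[0] for column in description]
results = [dict(zip(columns, row)) for row in rows]
print(results)
# [{'name': 'master', 'create_date': '2003-04-08'}, {'name': 'tempdb', 'create_date': '2013-01-30'}]
```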
Bryan
    did not know about `cursor.description`. this just saved me a boatload of time. – TehTris Mar 18 '15 at 23:49
  • needed to wrap parentheses around print (columns) and print (results) for this to work – LJT Sep 16 '15 at 06:46
  • This seems to work, but I can't access results like a dict. For example if i try to do something like results['columname'] I get an error like 'list indices must be integers, not str' though the output of 'print results' looks like yours does, like a dict! What gives? – Ben Lutgens Mar 21 '16 at 18:28
  • @BenLutgens Looks like a good opportunity for a new question. Be sure to include code to reproduce the error. – Bryan Mar 21 '16 at 18:43
  • @LJT Only in python3... but since the print() function works in python2, it's good practice to use it. – Auspex Feb 15 '17 at 11:33
  • @BenLutgens Because the example produces a _list_ of dicts, not a dict. – Auspex Feb 15 '17 at 11:34
  • This works fine. The only issue I have is that all the column names are lower case now and I'll have to change the html in order to display them correctly. – Weihui Guo Nov 29 '17 at 15:25
  • Update: by default, pypyodbc sets lowercase = True. You can overwrite this like this: import pypyodbc; pypyodbc.lowercase = False. Reference: [link](https://github.com/jiangwen365/pypyodbc/issues/23) – Weihui Guo Nov 29 '17 at 15:33
  • This is awesome. But when I run this I get the data back as a dictionary (key, value pair) set of objects. One of the fields gets returned with its type - not sure how to fix that - ```Decimal('1000.000')``` as I get an error now - ```Object of type Decimal is not JSON serializable``` – Loser Coder Sep 13 '18 at 17:55
  • Yeh. I have the same issue. How to convert Decimal object to decimal? – GLP Jan 30 '21 at 15:49
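For the `Decimal` (and `datetime`) serialization errors mentioned in the comments, one common approach is to pass a `default` callback to `json.dumps` that converts the offending types. A minimal sketch (the field names and values are made up):

```python
import json
from decimal import Decimal
from datetime import datetime

def jsonify_value(obj):
    # Fallback for types the json module can't serialize natively.
    if isinstance(obj, Decimal):
        return float(obj)          # or str(obj) to preserve exact digits
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

row = {'price': Decimal('1000.000'), 'created': datetime(2003, 4, 8, 9, 13, 36)}
print(json.dumps(row, default=jsonify_value))
# {"price": 1000.0, "created": "2003-04-08T09:13:36"}
```

Whether to emit `float` or `str` for `Decimal` depends on whether your consumer tolerates floating-point rounding.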
16

Using @Beargle's result with bottlepy, I was able to create this very concise query-exposing endpoint:

@route('/api/query/<query_str>')
def query(query_str):
    cursor.execute(query_str)
    return {'results':
            [dict(zip([column[0] for column in cursor.description], row))
             for row in cursor.fetchall()]}
Foo Stack
8

For situations where the cursor is not available (for example, when the rows have been returned by some function call or inner method), you can still create a dictionary representation by using row.cursor_description:

def row_to_dict(row):
    return dict(zip([t[0] for t in row.cursor_description], row))
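A way to sanity-check this without pyodbc is a stand-in row object: a tuple subclass carrying a `cursor_description` attribute, mimicking what a pyodbc `Row` exposes (the class and data below are made up for illustration):

```python
def row_to_dict(row):
    return dict(zip([t[0] for t in row.cursor_description], row))

# FakeRow mimics a pyodbc Row: iterable values plus a cursor_description attribute.
class FakeRow(tuple):
    cursor_description = [('id', int), ('name', str)]

row = FakeRow((1, 'alice'))
print(row_to_dict(row))
# {'id': 1, 'name': 'alice'}
```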
Kevin Campbell
7

Here is a short-form version you might be able to use:

>>> cursor.execute("<your SQL here>")
>>> single_row = dict(zip(zip(*cursor.description)[0], cursor.fetchone()))
>>> multiple_rows = [dict(zip(zip(*cursor.description)[0], row)) for row in cursor.fetchall()]

As you might be aware, adding * before a list basically unpacks it, passing the individual entries as separate arguments to the function you are calling. By using zip we pick the 1st to nth entries and zip them together, like the zipper on your pants.

so by using

zip(*[('a',1,2),('b',1,2)])
# interpreted by python as zip(('a',1,2),('b',1,2))

you get

[('a', 'b'), (1, 1), (2, 2)]

Since description is a tuple of tuples, where each inner tuple describes the header and the data type for one column, you can extract the first element of each tuple with

>>> columns = zip(*cursor.description)[0]

equivalent to

>>> columns = [column[0] for column in cursor.description]
Tommy Strand
  • With python3.4 I get: `TypeError: 'zip' object is not subscriptable`, so I cannot use the `zip(*description)[0]` trick. – malat Oct 19 '15 at 09:02
  • In python 3.4, zip is an iterator. You can wrap the zip in a list: `list(zip(*description))[0]` @malat – Tommy Strand Oct 19 '15 at 13:44
  • You saved one line with `columns` variable, but multiplied the function's complexity by calculating names of columns for each row separately – Sergey Nudnov Mar 09 '20 at 20:21
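Putting the Python 3 comments together, here is a runnable sketch, with a made-up `description` standing in for `cursor.description`:

```python
# In Python 3, zip() returns an iterator, so it cannot be subscripted directly.
description = [('name', str), ('create_date', str)]  # stand-in for cursor.description

columns = next(zip(*description))        # first tuple of the transposed description
print(columns)
# ('name', 'create_date')

# Equivalent, and arguably clearer:
assert list(columns) == [column[0] for column in description]
```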
3

Mainly going off @Torxed's response, I created a fully generalised set of functions to read the schema and data into a dictionary:

def schema_dict(cursor):
    cursor.execute("SELECT sys.objects.name, sys.columns.name FROM sys.objects INNER JOIN sys.columns ON sys.objects.object_id = sys.columns.object_id WHERE sys.objects.type = 'U';")
    schema = {}

    for it in cursor.fetchall():
        if it[0] not in schema:
            schema[it[0]] = {'scheme': []}
        schema[it[0]]['scheme'].append(it[1])

    return schema


def populate_dict(cursor, schema):
    for i in schema.keys():
        cursor.execute("select * from {table};".format(table=i))

        for row in cursor.fetchall():
            if 'data' not in schema[i]:
                schema[i]['data'] = []

            for colindex, col in enumerate(schema[i]['scheme']):
                schema[i]['data'].append(row[colindex])

    return schema

def database_to_dict():
    cursor = connect()
    schema = populate_dict(cursor, schema_dict(cursor))
    return schema

Feel free to go all code-golf on this to reduce the lines; but in the meantime, it works!

;)
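The table-grouping loop can be exercised on stand-in (table, column) rows; this compact variant uses `dict.setdefault` (the table and column names are made up):

```python
# The schema-building loop, on stand-in (table, column) rows (made-up data).
rows = [('users', 'id'), ('users', 'name'), ('orders', 'id')]

schema = {}
for table, column in rows:
    schema.setdefault(table, {'scheme': []})['scheme'].append(column)

print(schema)
# {'users': {'scheme': ['id', 'name']}, 'orders': {'scheme': ['id']}}
```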

Foo Stack
2

I like @Bryan's and @foo-stack's answers. If you are working with postgresql and you are using psycopg2, you can use some goodies from psycopg2 to achieve the same, by specifying the cursor_factory to be a DictCursor when creating your cursor from the connection, like this:

cur = conn.cursor( cursor_factory=psycopg2.extras.DictCursor )

So now you can execute your sql query and you'll get a dictionary to fetch your results, without the need to map them by hand.

cur.execute( sql_query )
results = cur.fetchall()

for row in results:
    print(row['row_no'])

Please note that you'll have to import psycopg2.extras for that to work.

matthaeus
1

Assuming you know your column names! Here are three different solutions; you probably want to look at the last one!

colnames = ['city', 'area', 'street']
data = {}

counter = 0
for row in x.fetchall():
    if not counter in data:
        data[counter] = {}

    colcounter = 0
    for colname in colnames:
        data[counter][colname] = row[colcounter]
        colcounter += 1

    counter += 1

That's an indexed version, not the most beautiful solution, but it will work. Another would be to use the column name as the dictionary key, with a list under each key containing the data in row order, by doing:

colnames = ['city', 'area', 'street']
data = {}

for row in x.fetchall():
    colindex = 0
    for col in colnames:
        if not col in data:
            data[col] = []
        data[col].append(row[colindex])
        colindex += 1

Writing this, I understand that `for col in colnames` could be replaced by `for colindex in range(len(colnames))`, but you get the idea. The latter example, though, would be useful when not fetching all data, but one row at a time, for instance:

Using dict for each row of data

def fetchone_dict(stuff):
    colnames = ['city', 'area', 'street']
    data = {}

    for colindex in range(len(colnames)):
        data[colnames[colindex]] = stuff[colindex]
    return data

row = x.fetchone()
print(fetchone_dict(row)['city'])
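The single-row variant can be checked with a plain tuple standing in for a fetched row (no database needed); this sketch uses a dict comprehension over `enumerate`:

```python
# fetchone_dict logic on a stand-in row tuple (made-up data).
colnames = ['city', 'area', 'street']
row = ('Oslo', 'West', 'Main St')

data = {colnames[i]: value for i, value in enumerate(row)}
print(data['city'])
# Oslo
```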

Getting table names (I think... thanks to Foo Stack); see a more direct solution from Beargle below!

cursor.execute("SELECT sys.objects.name, sys.columns.name FROM sys.objects INNER JOIN sys.columns ON sys.objects.object_id = sys.columns.object_id WHERE sys.objects.type = 'U';")
schema = {}
for it in cursor.fetchall():
    if it[0] in schema:
        schema[it[0]].append(it[1])
    else:
        schema[it[0]] = [it[1]]
Torxed
  • Thanks, but is there a generalised solution for when I don't know my column names? – Foo Stack May 13 '13 at 11:32
  • Yes, it's called SQL syntax. You can query your database for the names of the table you're querying against. http://stackoverflow.com/questions/4645456/get-table-names-from-a-database – Torxed May 13 '13 at 11:33
  • I have written a nice little generalised schema gatherer: – Foo Stack May 13 '13 at 12:44
  • `cursor.execute("SELECT sys.objects.name, sys.columns.name FROM sys.objects INNER JOIN sys.columns ON sys.objects.object_id = sys.columns.object_id WHERE sys.objects.type = 'U';") schema = {} for it in cursor.fetchall(): if it[0] in schema: schema[it[0]].append(it[1]) else: schema[it[0]] = [it[1]]` – Foo Stack May 13 '13 at 12:45
  • @FooStack Column names are already returned in [cursor.description](https://code.google.com/p/pyodbc/wiki/Cursor#description). A separate query isn't needed. – Bryan May 13 '13 at 14:24
1

What I needed, which is slightly different from what the OP was asking for:
If you want to fully generalize a routine that performs SQL SELECT queries, but you need to reference the results by an index number, not a name, you can do this with a list of lists instead of a dictionary.

Each row of returned data is represented in the returned list as a list of field (column) values.
The column names can be provided as the first entry of the returned list, so parsing the returned list in the calling routine can be really easy and flexible.
In this way, the routine doing the database call doesn't need to know anything about the data that it's handling. Here is such a routine:

    def read_DB_Records(self, tablename, fieldlist, wherefield, wherevalue) -> list:

        DBfile = 'C:/DATA/MyDatabase.accdb'
        # this connection string is for Access 2007, 2010 or later .accdb files
        conn = pyodbc.connect(r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ='+DBfile)
        cursor = conn.cursor()

        # Build the SQL Query string from the passed-in field list:
        SQL = "SELECT " + ", ".join("[" + field + "]" for field in fieldlist)
        SQL = SQL + " FROM " + tablename

        # Support an optional WHERE clause (parameterized to avoid SQL injection):
        params = []
        if wherefield != "" and wherevalue != "":
            SQL = SQL + " WHERE [" + wherefield + "] = ?"
            params.append(wherevalue)

        results = []    # Create the results list object

        cursor.execute(SQL, params) # Execute the Query

        # (Optional) Get a list of the column names returned from the query:
        columns = [column[0] for column in cursor.description]
        results.append(columns) # append the column names to the return list

        # Now add each row as a list of column data to the results list
        for row in cursor.fetchall():   # iterate over the cursor
            results.append(list(row))   # add the row as a list to the list of lists

        cursor.close()  # close the cursor
        conn.close()    # close the DB connection

        return results  # return the list of lists
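The query-string assembly can be factored out and tested on its own; `build_select` below is a hypothetical helper mirroring the string-building above, with a `?` placeholder for the optional WHERE value:

```python
# Hypothetical helper: assemble the SELECT statement only (no database access).
def build_select(tablename, fieldlist, wherefield=""):
    sql = "SELECT " + ", ".join("[" + field + "]" for field in fieldlist)
    sql += " FROM " + tablename
    if wherefield:
        sql += " WHERE [" + wherefield + "] = ?"  # parameter placeholder
    return sql

print(build_select("Customers", ["ID", "Name"], "City"))
# SELECT [ID], [Name] FROM Customers WHERE [City] = ?
```

Keeping the value out of the string and passing it separately to `cursor.execute` lets the driver handle quoting.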
Grimravus
1

The steps are as follows:

  1. Import libs:

    from pandas import DataFrame
    import pyodbc

  2. Get your results from the local db:

    db_file = r'xxx.accdb'
    odbc_conn_str = 'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=%s' % (db_file)

    conn = pyodbc.connect(odbc_conn_str)
    cur = conn.cursor()
    cur.execute("SELECT * FROM tbl")
    columns = [column[0] for column in cur.description]

    results = []
    for row in cur.fetchall():
        results.append(dict(zip(columns, row)))

    df = DataFrame(results)
    df
SherylHohman
0

I know it's old and I am just recapping what others already said. But I found this way neat, as it's also injection-safe.

def to_dict(row):
    return dict(zip([t[0] for t in row.cursor_description], row))

def query(cursor, query, params=(), cursor_func=to_dict):
    cursor.execute(query, params) 
    results = [cursor_func(row) for row in cursor.fetchall()]
    return results

quotes = query(cursor, "select * from currency where abbreviation like ?", ["USD"])
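The helper can be exercised end-to-end with a stand-in cursor (the class names and data below are made up; no database is involved):

```python
# Stand-in cursor to exercise the query() helper without a database.
class FakeRow(tuple):
    cursor_description = [('abbreviation', str), ('rate', float)]

class FakeCursor:
    def execute(self, sql, params=()):
        self.rows = [FakeRow(('USD', 1.0))]   # canned result set
    def fetchall(self):
        return self.rows

def to_dict(row):
    return dict(zip([t[0] for t in row.cursor_description], row))

def query(cursor, sql, params=(), cursor_func=to_dict):
    cursor.execute(sql, params)
    return [cursor_func(row) for row in cursor.fetchall()]

quotes = query(FakeCursor(), "select * from currency where abbreviation like ?", ["USD"])
print(quotes)
# [{'abbreviation': 'USD', 'rate': 1.0}]
```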
The Fool