1

My Problem

I have results from a SQL query that contains a list of tuples of only values, no column names.

A subset of my data might look like this:

dbSqlResults = [
    (u'Dave', u'Smith', u'dave@smith.bla with extra spaces   ', 1, 2, 3.2),
    (u'Valerie', u'Conklin', u'valconk@gmail.bla with extra spaces   ', 4, 5, 6.5),
    (u'Mighty Joe', u'Frazier', u'toughstuff@ko.bla with extra spaces   ', 7, 8, 9.8)
]

I want to convert this to a list of dicts and reuse the same list of keys for each dict entry, while iterating over the rows to add the values. The key list length and the number of values in each row tuple are the same dimension; both contain 11 entries in my actual data. And some of the values need whitespace trimmed prior to assignment to the dict.

The list of keys I want to apply to each tuple might look like this:

keys = ['first_name', 'last_name', 'email', 'partners', 'kids', 'meals_together']

And when properly merged I want this to end up looking like:

listOfDicts = [
    {'first_name' : 'Dave', 'last_name' : 'Smith', 'email' : 'dave@smith.bla', 'partners' : 1, 'kids' : 2, 'meals_together' : 3.2},
    {'first_name' : 'Valerie', 'last_name' : 'Conklin', 'email' : 'valconk@gmail.bla', 'partners' : 4, 'kids' : 5, 'meals_together' : 6.5},
    {'first_name' : 'Mighty Joe', 'last_name' : 'Frazier', 'email' : 'toughstuff@ko.bla', 'partners' : 7, 'kids' : 8, 'meals_together' : 9.8}
]

Research

I looked at Create a dictionary with list comprehension in Python but it doesn’t help as it assumes (k,v) whereas each tuple in my list has no keys and 11 actual values.

I looked at Different list values for dictionary keys but the output is wrong, I want to repeatedly apply a single list of keys to each tuple in my results as I convert them to a list of dicts.

I considered https://stackoverflow.com/a/5087977 but it appears to repeatedly iterate over the row array to obtain each column value, and doesn’t appear to be a graceful application of list comprehension.

My Solution

The code I came up with to solve this:

# assuming rows looks like dbSqlResults as declared above...
def sqlResultsToLOD(rows):
    keys = ['first_name', 'last_name', 'email', 'partners', 'kids', 'meals_together']
    result = []
    for first_name, last_name, email, partners, kids, meals in rows:
        result.append({
            keys[0] : first_name.strip(),
            keys[1] : last_name.strip(),
            keys[2] : email.strip(),
            keys[3] : partners,
            keys[4] : kids,
            keys[5] : meals
        })
    return results

This gives me my desired output but it doesn't feel very pythonic. It feels more like I'm writing JavaScript.

My Question

What's the more pythonic and graceful solution for this?

A couple of details as performance is a concern:

  • I currently have 11 values in each tuple
  • My result set currently contains 1.6 million rows of tuples, and the row count grows daily
  • Some values in each tuple need whitespace trimmed prior to assignment (see dbSqlResults data, 3rd field for example)

Thanks in advance for any tips, and apologies as well if I missed the answer in my research.

Community
  • 1
  • 1
soultech
  • 9
  • 2

1 Answers1

4

Using zip

>>> import pprint
>>> dbSqlResults = [
...     (u'Dave', u'Smith', u'dave@smith.bla with extra spaces   ', 1, 2, 3.2),
...     (u'Valerie', u'Conklin', u'valconk@gmail.bla with extra spaces   ', 4, 5, 6.5),
...     (u'Mighty Joe', u'Frazier', u'toughstuff@ko.bla with extra spaces   ', 7, 8, 9.8)
... ]
>>> keys = ['first_name', 'last_name', 'email', 'partners', 'kids', 'meals_together']
>>> pprint.pprint([dict(zip(keys, record)) for record in dbSqlResults])
[{'email': u'dave@smith.bla with extra spaces   ',
  'first_name': u'Dave',
  'kids': 2,
  'last_name': u'Smith',
  'meals_together': 3.2,
  'partners': 1},
 {'email': u'valconk@gmail.bla with extra spaces   ',
  'first_name': u'Valerie',
  'kids': 5,
  'last_name': u'Conklin',
  'meals_together': 6.5,
  'partners': 4},
 {'email': u'toughstuff@ko.bla with extra spaces   ',
  'first_name': u'Mighty Joe',
  'kids': 8,
  'last_name': u'Frazier',
  'meals_together': 9.8,
  'partners': 7}]

You can also use a generator function if you don't load all your data in memory.

def to_dict(records, keys):
    for record in records:
        for index, value in enumerate(record):
            if isinstance(value, (unicode, str)):
                record[index] = value.strip()
        yield dict(zip(keys, record))

Then:

for d in to_dict(dbSqlResults, keys):
    # do something
styvane
  • 59,869
  • 19
  • 150
  • 156