I have read a CSV file into pandas and converted the resulting DataFrame into a list of dictionaries, one per row, using `to_dict(orient='records')`. A shortened version of the list looks like this:

    records = [{'addjob': 'ADDJOB',
      'age': 'AGE',
      'disab': 'DISCURR13',
      'eth': 'ETHUKEUL',
      'full': 'FTPT'},
     {'addjob': 'ADDJOB',
      'age': 'AGE',
      'disab': 'DISCURR13',
      'eth': 'ETHUKEUL',
      'full': nan}]

I am trying to imitate this structure by using a dict comprehension like so:

    cleaned_records = OrderedDict([
        {k:v for k,v in i} for i in records
    ])

EDIT: removed 'OrderedDict' as it was a mistake (error is the same):

    cleaned_records = [{k:v for k,v in i} for i in records]

However, it is giving me the following error:

[Screenshot of the error traceback; image not preserved]

The reason I am trying to do this is so I can remove those keys from the dictionaries whose values are null before passing them to another set of functions.

I've been at this for quite a while now and I'm baffled as to why this dict comprehension is not working. Can anyone help me?

  • Because iterating over a dictionary iterates over its *keys*. You are trying to unpack each key into a tuple, `k, v`, which fails. Also, I'm not sure what you are trying to accomplish with an OrderedDict here. – juanpa.arrivillaga Feb 28 '17 at 16:25
  • Why don't you apply a query/conditional to the dataset to filter out what you don't want, then use `.to_dict` on that? – Jon Clements Feb 28 '17 at 16:27
  • @juanpa.arrivillaga the OrderedDict was an error that I realised only after posting the question; I thought it was a dictionary of dictionaries at first , which is what I'll be doing with it afterwards. – Marc Lawson Feb 28 '17 at 16:29
  • The only thing that confuses me here is that this answer says to use tuples in this way: http://stackoverflow.com/a/1747827/6315440. When I use `[{k: v for k in i} for i in records]` instead, all the keys get the same value, which I don't want. – Marc Lawson Feb 28 '17 at 16:45
  • You aren't understanding that answer, but to be fair, it isn't being explicit. That expression should give you a NameError for `v`. – juanpa.arrivillaga Feb 28 '17 at 17:11
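
The point made in the comments above can be checked with a small sketch. It is written for Python 3 and uses a shortened, made-up `record` rather than the asker's full data; the names `keys_seen`, `error_message`, and `pairs` are only for illustration.

```python
# A shortened record, assumed for illustration only.
record = {'addjob': 'ADDJOB', 'age': 'AGE'}

# Iterating over a dict yields its keys, not (key, value) pairs.
keys_seen = [item for item in record]

# Unpacking each key string into two names fails, because 'addjob'
# has six characters rather than exactly two.
error_message = None
try:
    pairs = [(k, v) for k, v in record]
except ValueError as exc:
    error_message = str(exc)

# .items() yields (key, value) tuples, which unpack cleanly.
pairs = [(k, v) for k, v in record.items()]
```

Note that a key with exactly two characters would unpack without an error and silently produce wrong results, so the failure here depends on the key lengths.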

1 Answer

You're just missing the `.items()` call on each dict (or `.iteritems()`, which exists only in Python 2). Without it, the inner loop iterates over the keys alone.

In [28]: [{k:v for k,v in i.iteritems()} for i in records]
Out[28]: 
[{'addjob': 'ADDJOB',
  'age': 'AGE',
  'disab': 'DISCURR13',
  'eth': 'ETHUKEUL',
  'full': 'FTPT'},
 {'addjob': 'ADDJOB',
  'age': 'AGE',
  'disab': 'DISCURR13',
  'eth': 'ETHUKEUL',
  'full': nan}]
clocker
  • Thanks @clocker, that got it working! I thought I had tried that but must have made a mistake. I'm confused by what juanpa.arrivillaga said above, though, because I seem to have been on the right track and not misunderstanding fortran's answer after all. I'm fairly new to Python, so I'm easily confused at this stage. – Marc Lawson Feb 28 '17 at 17:23
  • No worries, but no need to say Thanks. I've been told that it messes up auto checking in stackoverflow and might be confused with spam. Please up-vote an answer instead if it works for you. Have fun with python! – clocker Feb 28 '17 at 17:31
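
The question's stated goal was to drop keys whose values are null, which the comprehension in the answer does not yet do. One possible way to add that step, sketched for Python 3 with `.items()`, is shown below. The `is_missing` helper is hypothetical (it is not from the thread), the records are shortened, and `nan` is created with `float('nan')` as a stand-in for the NaN that pandas places in empty cells.

```python
import math

# Stand-in for the NaN value pandas uses for empty cells (an assumption
# made so this sketch runs without pandas installed).
nan = float('nan')

# Shortened version of the records from the question.
records = [
    {'addjob': 'ADDJOB', 'age': 'AGE', 'full': 'FTPT'},
    {'addjob': 'ADDJOB', 'age': 'AGE', 'full': nan},
]


def is_missing(value):
    """Hypothetical helper: True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


# Rebuild each dict, keeping only the pairs whose value is present.
cleaned_records = [
    {k: v for k, v in record.items() if not is_missing(v)}
    for record in records
]
```

With pandas available, `pandas.isnull(v)` could probably replace the helper. The explicit check is used here because NaN never compares equal to itself, so a test such as `v != nan` would not filter anything.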