
I have loaded some data into a pandas DataFrame from a CSV file that was exported from Postgres. One of the columns comes from a PostgreSQL array field and looks like this (note that the whole array is enclosed in a string):

"[
    {"keyOne": "valueOne", "keyTwo": "valueTwo"},
    {"keyOne": "valueOne", "keyTwo": "valueTwo"},
    ...
]"

Note: the object keys are all double quoted. However, not all of the values are double quoted, as some were originally boolean.

How can I parse this into a Python list with the following form:

[
      {"keyOne": "valueOne", "keyTwo": "valueTwo"},
      {"keyOne": "valueOne", "keyTwo": "valueTwo"},
]
Brylie Christopher Oxley
  • Are all of the quotes double quotes? And, can you post an example of the list you want? (e.g., all elements, in order) – Evan Jan 12 '18 at 16:10
  • I updated the question with regards to your questions. Thanks ☺ – Brylie Christopher Oxley Jan 12 '18 at 16:27
  • Got it. Can you try this? https://stackoverflow.com/questions/3085382/python-how-can-i-strip-first-and-last-double-quotes I'm having trouble getting a field like yours into pandas via read_csv. – Evan Jan 12 '18 at 16:41
  • Ok, cool. Will that return a list? I might be able to use `df.column.apply(json.loads)`. – Brylie Christopher Oxley Jan 12 '18 at 16:45
  • If you remove the first and last double quotes, you should have a list, yes. The issue with the repeated double quotes is that you have a string - `"[{"` - followed by essentially a variable declaration - `keyOne` - followed by a string - `": "`, etc. Stripping the first and last quotes should be all you need. LMK! – Evan Jan 12 '18 at 17:46
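
The approach discussed in the comments (strip the enclosing quotes, then hand the rest to `json.loads`) could look roughly like the sketch below. It assumes each cell holds a valid JSON array once the outer quotes are gone; the helper name `parse_array_cell` and the sample value `raw` are made up for illustration and are not from the original data. Unquoted booleans such as `true` are valid JSON and come back as Python `True`.

```python
import json


def parse_array_cell(cell):
    """Parse one cell that holds a JSON array (as text) into a Python list."""
    text = cell.strip()
    # Assumption: the cell may still carry a literal pair of enclosing
    # double quotes, as in the question; drop that pair if present.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        text = text[1:-1]
    # json.loads accepts unquoted true/false/null and maps them to
    # True/False/None, so mixed string and boolean values are fine.
    return json.loads(text)


# Hypothetical cell value modeled on the question, with one boolean value.
raw = '"[{"keyOne": "valueOne", "keyTwo": true}, {"keyOne": "valueOne", "keyTwo": "valueTwo"}]"'
parsed = parse_array_cell(raw)
print(parsed)
```

With pandas, the same helper could be applied per cell, for example `df["column"] = df["column"].apply(parse_array_cell)`, where `column` is a placeholder for the real column name. If the CSV was written with standard quoting, `read_csv` has probably removed the enclosing quotes already, in which case the stripping step simply does nothing.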

0 Answers