I am trying to extract the arrival time values from the following, which appears to be a list of dictionaries:

[{'arrival': {'time': 1508791028L},
  'departure': {'time': 1508791028L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508790596L},
  'departure': {'time': 1508790596L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508791744L},
  'departure': {'time': 1508791744L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508792223L},
  'departure': {'time': 1508792223L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508793450L},
  'departure': {'time': 1508793450L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508792591L},
  'departure': {'time': 1508792591L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508794110L},
  'departure': {'time': 1508794110L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508794740L},
  'departure': {'time': 1508794740L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508788421L},
  'departure': {'time': 1508788421L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508788919L},
  'departure': {'time': 1508788919L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508789417L},
  'departure': {'time': 1508789417L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508790287L},
  'departure': {'time': 1508790287L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508790347L},
  'departure': {'time': 1508790347L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508791330L},
  'departure': {'time': 1508791330L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508791799L},
  'departure': {'time': 1508791799L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508792447L},
  'departure': {'time': 1508792447L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508793300L},
  'departure': {'time': 1508793300L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508793840L},
  'departure': {'time': 1508793840L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508794380L},
  'departure': {'time': 1508794380L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508794800L},
  'departure': {'time': 1508794800L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508795220L},
  'departure': {'time': 1508795220L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'}]

However, when I tried to inspect the structure of the list with a simple loop:

for i in in_dict:
    print i
    print "************************"

I got an output like this:

[{'arrival': {'time': 1508791028L},

************************
  'departure': {'time': 1508791028L},

************************
  'schedule_relationship': 0,

************************
  'stop_id': u'D03N'},

************************
 {'arrival': {'time': 1508790596L},

************************
  'departure': {'time': 1508790596L},

That output suggests to me that the elements in the list are not really dictionaries. For example, "{'arrival': {'time': 1508790596L}" would need another "}" to be a well-formed dictionary.
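One quick way to test that suspicion is to look at the type of each element instead of printing the element itself. This is only a sketch, assuming Python 3. Both `lines` and `records` are hypothetical stand-ins: the first mimics what reading the file line by line would produce, the second is a genuine list of dictionaries.

```python
# Hypothetical stand-ins for illustration only.
# `lines` mimics a file that was read line by line (each element is text).
lines = [
    "[{'arrival': {'time': 1508791028L},\n",
    "  'departure': {'time': 1508791028L},\n",
]
# `records` is an actual list of dictionaries.
records = [{'arrival': {'time': 1508791028}, 'stop_id': 'D03N'}]


def element_types(seq):
    """Return the set of type names found among the elements of seq."""
    return {type(item).__name__ for item in seq}


print(element_types(lines))    # only strings: this is text, not dictionaries
print(element_types(records))  # only dicts: a real list of dictionaries
```

If the first case is what shows up, the data is a list of text lines that merely looks like a list of dictionaries.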

My primary question: what is the best way to extract the arrival times from this data? My secondary question: is this actually a list of dictionaries, or just a list of items that happen to resemble one?
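For reference, if the data really is a list of dictionaries, pulling out the arrival times takes a single list comprehension. This is a minimal sketch, assuming Python 3. `records` is a hypothetical two-item excerpt of the data above, written without the Python 2 `L` and `u` markers.

```python
# Hypothetical excerpt of the data above, as real Python 3 dictionaries.
records = [
    {'arrival': {'time': 1508791028},
     'departure': {'time': 1508791028},
     'schedule_relationship': 0,
     'stop_id': 'D03N'},
    {'arrival': {'time': 1508790596},
     'departure': {'time': 1508790596},
     'schedule_relationship': 0,
     'stop_id': 'D03N'},
]

# Each record holds a nested dictionary under 'arrival' with a 'time' key.
arrival_times = [rec['arrival']['time'] for rec in records]

# Variant that tolerates records lacking an 'arrival' entry (yields None).
arrival_times_safe = [rec.get('arrival', {}).get('time') for rec in records]

print(arrival_times)
```

The `.get` variant is only worth using if some records might lack an arrival entry; whether that happens in this feed is not something the data shown here settles.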

mweinberg
  • 2
    yeah, your "dict" is a list of lines. You probably read a file line by line, not using `json` or `ast.literal_eval` like you should have done. – Jean-François Fabre Oct 23 '17 at 20:29
  • so it's a list of pure strings, right? – RomanPerekhrest Oct 23 '17 at 20:44
  • 1
    This does look like "record" orientation of json except that we could use more info as to where you're obtaining it, how it's stored, etc – Brad Solomon Oct 23 '17 at 20:49
  • @Jean-FrançoisFabre I was worried about that, thanks. – mweinberg Oct 23 '17 at 20:50
  • @BradSolomon it's NYC transit GTFS data. If it was presented as json this entire enterprise would be much, much easier. That is a string - I think I'm just going to end up with a regex search because the data is almost structured. – mweinberg Oct 23 '17 at 20:52
  • 1
    But ... do you have it as a variable? As a text file? It looks precisely like json. – Brad Solomon Oct 23 '17 at 20:52
  • 1
    I ask because it's unclear how 1508791028L, 1508791028L are formatted like they are, without surrounding quotes – Brad Solomon Oct 23 '17 at 20:53
  • It is a file output by the "more efficient way" here: https://stackoverflow.com/questions/46514274/extract-elements-from-complex-list-of-lists-and-dictionaries-in-python and then loaded into a new testing script as a file. – mweinberg Oct 23 '17 at 20:54
  • @BradSolomon ah I understand why you asked that. In saving the dictionary to a file I screwed up the entire structure. Thanks! – mweinberg Oct 23 '17 at 20:59
  • @mweinberg You know you can call json.loads(my_var) on a string to get it as a python json object right? That looks like it would handle pretty much everything... – mbrig Oct 23 '17 at 21:01
  • 1
    [`pd.read_json`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_json.html) with `orient='records'` may also be helpful here. – Brad Solomon Oct 23 '17 at 21:02
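To illustrate the `ast.literal_eval` suggestion from the comments: the text in the question is a Python 2 literal, not JSON (single quotes, `u'...'` strings, and `L` suffixes on the integers), so a JSON parser would likely reject it as-is. Below is a rough sketch, assuming Python 3 and assuming the whole file content is available as one string. `parse_py2_repr` and `sample` are hypothetical names introduced only for this example.

```python
import ast
import re


def parse_py2_repr(text):
    """Parse a Python 2 style literal (as written by pprint) under Python 3."""
    # Drop the Python 2 long-integer suffix: 1508791028L -> 1508791028.
    # Caveat: this would also rewrite a quoted string that consists of
    # digits followed by 'L'; no such value appears in the data shown here.
    cleaned = re.sub(r'\b(\d+)L\b', r'\1', text)
    # literal_eval only evaluates literals, so it is safer than eval().
    return ast.literal_eval(cleaned)


# Hypothetical two-record excerpt of the file content, as a single string.
sample = """[{'arrival': {'time': 1508791028L},
  'departure': {'time': 1508791028L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'},
 {'arrival': {'time': 1508790596L},
  'departure': {'time': 1508790596L},
  'schedule_relationship': 0,
  'stop_id': u'D03N'}]"""

records = parse_py2_repr(sample)
arrival_times = [rec['arrival']['time'] for rec in records]
print(arrival_times)
```

The key point is to read the file in one go (for example with `f.read()`) and parse the whole string, rather than iterating over it line by line.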

1 Answer

I created this fake dictionary by writing an existing dictionary to a file with:

from pprint import pprint

with open('small_list.txt', 'wt') as out:
    pprint(small_list, stream=out)

and then loading the small_list.txt file into a simple Python script so I could play with it without the rest of the script getting in the way. However, as @bradsolomon helped me realize in the comments, writing the file that way saved only a pretty-printed text representation, and reading it back line by line gave me a list of strings rather than a list of dictionaries. That is what created the decoy dictionary. Therefore, the answer to this question is probably "don't save a list of dictionaries with pprint(..., stream=out) and expect to get the same structure back" or something similar.
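One alternative that keeps the structure intact is to write the list with the `json` module and read it back the same way. This is a sketch only, assuming Python 3. `small_list` here is a hypothetical two-record stand-in, and the temporary `small_list.json` path is made up for the example.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the real list of dictionaries.
small_list = [
    {'arrival': {'time': 1508791028},
     'departure': {'time': 1508791028},
     'schedule_relationship': 0,
     'stop_id': 'D03N'},
    {'arrival': {'time': 1508790596},
     'departure': {'time': 1508790596},
     'schedule_relationship': 0,
     'stop_id': 'D03N'},
]

# Write to a throwaway directory so the example leaves the working dir alone.
path = os.path.join(tempfile.mkdtemp(), 'small_list.json')

# Save the structure as JSON text.
with open(path, 'w') as out:
    json.dump(small_list, out, indent=2)

# Load it back: the result is a list of dictionaries again, not text lines.
with open(path) as src:
    loaded = json.load(src)

arrival_times = [rec['arrival']['time'] for rec in loaded]
print(arrival_times)
```

JSON is enough here because the records contain only strings, integers, and nested dictionaries. Data with types JSON cannot represent would need a different format.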

mweinberg