1

I'm looking to extract sets of values from a JSON file and write them to an output file.

The format of the JSON is as follows:

    "interactions":     [
    {
        "type": "free",
        "input":             [
            [ 1, 4594, 119218, 0, [71, 46], [2295, 1492], [71, 46], [2295, 1492], 16017, 520790446, [71, 46, 71, 46], [71, 46, 71, 46] ],
            [ 1, 4594, 119219, 0, [72, 46], [2323, 1492], [72, 46], [2323, 1492], 26016, 520790456, [72, 46, 72, 46], [72, 46, 72, 46] ],
            [ 1, 4594, 119220, 0, [72, 45], [2323, 1464], [72, 45], [2323, 1464], 26016, 520790466, [72, 45, 72, 45], [72, 45, 72, 45] ],
            [ 1, 4594, 119221, 0, [72, 45], [2323, 1464], [72, 45], [2323, 1464], 26016, 520790476, [72, 45, 72, 45], [72, 45, 72, 45] ],
            [ 1, 4594, 119222, 0, [73, 45], [2350, 1464], [73, 45], [2350, 1464], 26016, 520790486, [73, 45, 73, 45], [73, 45, 73, 45] ],
            [ 1, 4594, 119223, 0, [73, 45], [2350, 1464], [73, 45], [2350, 1464], 26016, 520790496, [73, 45, 73, 45], [73, 45, 73, 45] ],
            [ 1, 4594, 119224, 0, [73, 45], [2350, 1464], [73, 45], [2350, 1464], 46000, 520790506, [73, 45, 73, 45], [73, 45, 73, 45] ]
        ]

What I need to extract is the [71, 46] column (index 4 of each row) and the column that starts with 520790446 (index 9), and write them to an output file.

Below is the code I've got at the minute:

import json

# Open the file in a with block so it is closed automatically.
with open("test_json.json") as json_data:
    data = json.load(json_data)

# Need some sort of nested loop here to iterate through each line of the block, and each block also.
print data["interactions"][0]["input"][0][4], '\t', data["interactions"][0]["input"][0][9]

There are several of these blocks of variable length, and I need to extract all the values until the end of the file. I'm stuck at the loop structure though.

Could anyone be of assistance?

MattH
Matthew

2 Answers

2

You can get at the data like so:

[x[4] for x in data["interactions"][0]["input"]]

[x[9] for x in data["interactions"][0]["input"]]

or in one go, something like

[[x[4], x[9]] for x in data["interactions"][0]["input"]]
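As a quick check, here is that comprehension applied to the first two rows from the question. The `data` dictionary below is typed in by hand for illustration rather than loaded from the file.

```python
# First two rows of the question's sample, typed in by hand.
data = {
    "interactions": [
        {
            "type": "free",
            "input": [
                [1, 4594, 119218, 0, [71, 46], [2295, 1492], [71, 46], [2295, 1492],
                 16017, 520790446, [71, 46, 71, 46], [71, 46, 71, 46]],
                [1, 4594, 119219, 0, [72, 46], [2323, 1492], [72, 46], [2323, 1492],
                 26016, 520790456, [72, 46, 72, 46], [72, 46, 72, 46]],
            ],
        }
    ]
}

# Pick index 4 (the coordinate pair) and index 9 (the large integer) per row.
pairs = [[x[4], x[9]] for x in data["interactions"][0]["input"]]
print(pairs)  # [[[71, 46], 520790446], [[72, 46], 520790456]]
```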

To answer the first part of the comment, loop over the interactions first and then over each one's input rows:

[[x[4], x[9]] for interaction in data["interactions"] for x in interaction["input"]]
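A comprehension with two `for` clauses reads them left to right, in the same order as nested `for` statements, so the loop that defines `interaction` has to come before the loop that uses `interaction["input"]`. A minimal sketch of that equivalence follows; the two-block `data` here is made up, with unused columns filled with `None`.

```python
# Hypothetical data: two blocks of different lengths. Only index 4 (the
# coordinate pair) and index 9 (the large integer) matter here.
data = {
    "interactions": [
        {"input": [
            [1, 4594, 119218, 0, [71, 46], None, None, None, 16017, 520790446],
            [1, 4594, 119219, 0, [72, 46], None, None, None, 26016, 520790456],
        ]},
        {"input": [
            [1, 4594, 119220, 0, [72, 45], None, None, None, 26016, 520790466],
        ]},
    ]
}

# Explicit loops: outer loop over blocks, inner loop over rows.
pairs_loop = []
for interaction in data["interactions"]:
    for x in interaction["input"]:
        pairs_loop.append([x[4], x[9]])

# The same thing as a comprehension, with the for clauses in the same order.
pairs_comp = [[x[4], x[9]]
              for interaction in data["interactions"]
              for x in interaction["input"]]

assert pairs_loop == pairs_comp
```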
YXD
  • How would I go about iterating the block after ["interactions"] above ([0]) also? That's the block which iterates through the entire JSON file, I believe. Also, can this be outputted to a .csv file or something similar with a line break after each line? e.g. "[71, 46] 520790446" on each line. – Matthew Apr 08 '13 at 14:06
  • See update. Best to ask a separate question for outputting the data or better still have a search here... – YXD Apr 08 '13 at 14:13
  • Sorry to be a plague, but I'm getting "NameError: name 'interaction' is not defined" now. I'm not long using Python, and the loop structures are still foreign to me. – Matthew Apr 08 '13 at 14:25
  • I don't have your full data so I can't really test, but I would first check what `[interaction for interaction in data["interactions"]]` prints to the console, then narrow it down with `[interaction["input"] for interaction in data["interactions"]]`, then iterate over those as in the question. Where does it break down? – YXD Apr 08 '13 at 14:29
  • I reversed the fors and it seems to work now. I'm now using `[[x[4], x[9]] for interaction in data["interactions"] for x in interaction["input"]]`. Thank you for your help! – Matthew Apr 08 '13 at 14:35
0
def gen_vals(data):
    # Walk every interaction block, then every row in its "input" list.
    for interaction in data["interactions"]:
        for row in interaction["input"]:
            yield (row[4], row[9])

This is a generator; its values can be collected into a list like so:

vals = list(gen_vals(data))
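Since the question also asks for the values to be written to a file, the generator can feed a writer directly without building the list first. A minimal sketch under some assumptions: `write_vals`, the tab-separated layout and the `output.txt` name are illustrative choices, not something from the question, and the sample `data` is made up.

```python
import os
import tempfile


def gen_vals(data):
    # Yield (row[4], row[9]) for every row of every interaction block.
    for interaction in data["interactions"]:
        for row in interaction["input"]:
            yield (row[4], row[9])


def write_vals(data, path):
    """Write one 'coords<TAB>value' line per row to path (hypothetical helper)."""
    with open(path, "w") as out:
        for coords, value in gen_vals(data):
            out.write("{}\t{}\n".format(coords, value))


# Made-up sample with two blocks; unused columns are filled with None.
data = {
    "interactions": [
        {"input": [[1, 4594, 119218, 0, [71, 46], None, None, None, 16017, 520790446]]},
        {"input": [[1, 4594, 119220, 0, [72, 45], None, None, None, 26016, 520790466]]},
    ]
}

# Write to a temporary directory so the sketch does not clutter the working dir.
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")
write_vals(data, out_path)

with open(out_path) as f:
    written = f.read()
```

Each line of the resulting file holds the coordinate pair, a tab, and the integer, for example `[71, 46]` followed by `520790446`. If a stricter CSV layout is needed, the standard library `csv` module is the usual alternative.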
Preet Kukreti