
With the following code for parsing JSON-esque data:

import re
data = open('toy.json', 'r')

regexp = re.compile("gas")

for line in data: 
    print(line)
    Result = re.search(regexp, line)
    if Result:
        print(Result.groups())

I'd like to extract all values associated with the keys gas and hash. The data looks like this:

{
  "blockNumber": "1941794",
  "blockHash": "0x41ee74e34cbf9ef4116febea958dbc260e2da3a6bf6f601bfaeb2cd9ab944a29",
  "hash": "0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63",
  "from": "0x3c0cbb196e3847d40cb4d77d7dd3b386222998d9",
  "to": "0x2ba24c66cbff0bda0e3053ea07325479b3ed1393",
  "gas": "121000",
  "gasUsed": "21000",
  "gasPrice": "20000000000",
  "input": "",
  "logs": [],
  "nonce": "14",
  "value": "0x24406420d09ce7440000",
  "timestamp": "2016-07-24 20:28:11 UTC"
}
{
  "blockNumber": "1941716",
  "blockHash": "0x75e1602cad967a781f4a2ea9e19c97405fe1acaa8b9ad333fb7288d98f7b49e3",
  "hash": "0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26",
  "from": "0xa0480c6f402b036e33e46f993d9c7b93913e7461",
  "to": "0xb2ea1f1f997365d1036dd6f00c51b361e9a3f351",
  "gas": "121000",
  "gasUsed": "21000",
  "gasPrice": "20000000000",
  "input": "",
  "logs": [],
  "nonce": "1",
  "value": "0xde0b6b3a7640000",
  "timestamp": "2016-07-24 20:12:17 UTC"
}

Ideally, the result would be something like:

  "hash": "0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26",
  "gas": "121000",
  "hash": "0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63",
  "gas": "121000",

But what I get is nowhere near that.

smatthewenglish

2 Answers

I don't have enough reputation to add a comment, but unless you really need regex for some reason, I would use Python's json library. See this answer for details on loading JSON into a Python dictionary and extracting the values.
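
As a minimal sketch of that suggestion, the snippet below parses one record with the standard json module and picks out the two keys by name. The variable names (record_text, wanted) are only illustrative, and the record is a shortened copy of the first one from the question.

```python
import json

# Shortened copy of one record from the question (illustrative only).
record_text = """
{
  "hash": "0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63",
  "gas": "121000",
  "gasUsed": "21000"
}
"""

# Parse the JSON text into a regular Python dict.
record = json.loads(record_text)

# Exact key lookup: "gas" does not also pick up "gasUsed" or "gasPrice",
# which a bare substring search for "gas" would.
wanted = {key: record[key] for key in ("hash", "gas")}
print(wanted)
```

Note that the values stay strings, as in the data; convert with int() if numbers are needed.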

kdd

I agree with kdd. There's no reason to use regex here, but I think your JSON is problematic: the file holds two top-level objects back to back, so it is not a single valid JSON document.
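
To make the problem concrete, here is a small sketch (the two_objects string is a made-up stand-in for the file's contents) showing that the standard parser rejects two objects placed one after the other:

```python
import json

# Two top-level objects back to back, as in the question's file
# (cut down to a single key each for brevity).
two_objects = '{"gas": "121000"}\n{"gas": "121000"}'

error_message = None
try:
    json.loads(two_objects)
except json.JSONDecodeError as exc:
    # The parser reads the first object, then complains about what follows.
    error_message = exc.msg
    print("not a single JSON document:", exc)
```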

Assuming you only have a single file, I think it should look something like this:

{
  "entry": [
    {
      "blockNumber": "1941794",
      "blockHash": "0x41ee74e34cbf9ef4116febea958dbc260e2da3a6bf6f601bfaeb2cd9ab944a29",
      "hash": "0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63",
      "from": "0x3c0cbb196e3847d40cb4d77d7dd3b386222998d9",
      "to": "0x2ba24c66cbff0bda0e3053ea07325479b3ed1393",
      "gas": "121000",
      "gasUsed": "21000",
      "gasPrice": "20000000000",
      "input": "",
      "logs": [],
      "nonce": "14",
      "value": "0x24406420d09ce7440000",
      "timestamp": "2016-07-24 20:28:11 UTC"
    },
    {
      "blockNumber": "1941716",
      "blockHash": "0x75e1602cad967a781f4a2ea9e19c97405fe1acaa8b9ad333fb7288d98f7b49e3",
      "hash": "0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26",
      "from": "0xa0480c6f402b036e33e46f993d9c7b93913e7461",
      "to": "0xb2ea1f1f997365d1036dd6f00c51b361e9a3f351",
      "gas": "121000",
      "gasUsed": "21000",
      "gasPrice": "20000000000",
      "input": "",
      "logs": [],
      "nonce": "1",
      "value": "0xde0b6b3a7640000",
      "timestamp": "2016-07-24 20:12:17 UTC"
    }
  ]
}

Then it's just:

import json
with open('your_file.json') as f:
    data = json.load(f)
# data is a dict; the records are in the list stored under the "entry" key
for entry in data['entry']:
    print(entry['hash'])
    print(entry['gas'])
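
If restructuring the file isn't an option, one possible workaround is to decode the objects one at a time with json.JSONDecoder.raw_decode, which returns the parsed value together with the position where it stopped. This is only a sketch: iter_json_objects is a hypothetical helper name, and sample is a shortened stand-in for the file's contents.

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value from text that holds several in a row."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # Returns the decoded value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Shortened stand-in for the question's file: two objects back to back.
sample = """
{
  "hash": "0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63",
  "gas": "121000",
  "gasUsed": "21000"
}
{
  "hash": "0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26",
  "gas": "121000",
  "gasUsed": "21000"
}
"""

results = [(entry["hash"], entry["gas"]) for entry in iter_json_objects(sample)]
for tx_hash, gas in results:
    print(tx_hash, gas)
```

With a real file, the text would come from reading the whole file into a string first, which assumes it fits comfortably in memory.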
Brendan A.