1

I have a list of dictionaries (loaded from JSON). I also have a list of values, and for each value I want to fetch the JSON object that matches it. For example:

[{'content_type': 'Press Release',
  'content_id': '1',
  'Author': 'John'},
 {'content_type': 'editorial',
  'content_id': '2',
  'Author': 'Harry'},
 {'content_type': 'Article',
  'content_id': '3',
  'Author': 'Paul'}]

I want to fetch the complete object where the author is Paul. This is the code I have written so far:

import json

newJson = "testJsonNewInput.json"
ListForNewJson = []

def testComparision(newJson, oldJson):
    with open(newJson, mode='r') as fp_n:
        json_data_new = json.load(fp_n)
    # Collect the author of every object (the key is 'Author' in the data)
    for jData_new in json_data_new:
        ListForNewJson.append(jData_new['Author'])

If any other information is required, please ask.

Neil
john

2 Answers

1

Case 1
One time access

It is perfectly alright to read your data and iterate over it, returning the first match found.

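import json

def access(file, author):
    # Load the list of objects from the JSON file
    with open(file) as f:
        data = json.load(f)

    # Return the first object whose 'Author' matches
    for d in data:
        if d['Author'] == author:
            return d
    return 'Not Found'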
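
As a minimal sketch of the same first-match lookup on data that is already in memory, assuming the sample list from the question, something like the following could work. The name find_by_author is hypothetical, and returning None instead of 'Not Found' is a design choice of this sketch, not of the answer.

```python
def find_by_author(data, author):
    """Return the first dict whose 'Author' equals author, or None."""
    # next() stops at the first match; the second argument is the
    # default returned when the generator yields nothing.
    return next((d for d in data if d.get('Author') == author), None)

# Sample data copied from the question (author names quoted as strings)
data = [
    {'content_type': 'Press Release', 'content_id': '1', 'Author': 'John'},
    {'content_type': 'editorial', 'content_id': '2', 'Author': 'Harry'},
    {'content_type': 'Article', 'content_id': '3', 'Author': 'Paul'},
]

print(find_by_author(data, 'Paul'))
```

Using d.get('Author') rather than d['Author'] means an object without an 'Author' key is skipped instead of raising KeyError.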

Case 2
Repeated access

In this instance, it would be wise to reshape your data in such a way that accessing objects by author names is much faster (think dictionaries!).

For example, one possible option would be:

with open(file) as f:
    data = json.load(f)

newData = {}
for d in data:
    newData[d['Author']] = d

Now, define a function and pass your pre-loaded data along with a list of author names.

def access(myData, author_list):
    for a in author_list:
        yield myData.get(a)

The function is called like this:

for i in access(newData, ['Paul', 'John', ...]):
    print(i)

Alternatively, store the results in a list r. The list(...) is necessary because access is a generator function: calling it returns a generator object, which you must exhaust by iterating over it.

r = list(access(newData, [...]))
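
To make the behaviour for unknown names concrete, here is a small self-contained sketch of Case 2. It assumes the sample data from the question is already in memory rather than read from a file, and 'Ringo' is a made-up name that is deliberately absent.

```python
# Sample data copied from the question (author names quoted as strings)
data = [
    {'content_type': 'Press Release', 'content_id': '1', 'Author': 'John'},
    {'content_type': 'editorial', 'content_id': '2', 'Author': 'Harry'},
    {'content_type': 'Article', 'content_id': '3', 'Author': 'Paul'},
]

# Index the objects by author so each lookup is a dictionary access
newData = {d['Author']: d for d in data}

def access(myData, author_list):
    # dict.get returns None when the author is not in the index
    for a in author_list:
        yield myData.get(a)

r = list(access(newData, ['Paul', 'Ringo']))
print(r)
```

Note that a dictionary keeps one value per key, so if two objects share an author, the later one overwrites the earlier one in newData.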
cs95
0

Why not do something like this? It should be fast, and you will not have to cache authors that are never searched for.

alreadyknown = {}
list_of_obj = [{'content_type': 'Press Release',
    'content_id': '1',
    'Author':'John'},
    {'content_type': 'editorial',
    'content_id': '2',
    'Author': 'Harry'
    },
    {'content_type': 'Article',
    'content_id': '3',
    'Author':'Paul'}]
def func(author):
    if author not in alreadyknown:
        obj = get_obj(author)
        alreadyknown[author] = obj
    return alreadyknown[author]
def get_obj(auth):
    # Compare strings with ==, not is (is tests object identity)
    return [obj for obj in list_of_obj if obj['Author'] == auth]
print(func('Paul'))
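
The caching dictionary above can also be expressed with functools.lru_cache from the standard library. This is a swapped-in technique rather than the code in this answer; the name func_cached is hypothetical, and the sketch returns a tuple so that the cached result cannot be modified by callers.

```python
from functools import lru_cache

# Sample data copied from the answer above
list_of_obj = [
    {'content_type': 'Press Release', 'content_id': '1', 'Author': 'John'},
    {'content_type': 'editorial', 'content_id': '2', 'Author': 'Harry'},
    {'content_type': 'Article', 'content_id': '3', 'Author': 'Paul'},
]

@lru_cache(maxsize=None)
def func_cached(author):
    # The list is scanned once per distinct author; repeated calls
    # with the same author are served from the cache.
    return tuple(obj for obj in list_of_obj if obj['Author'] == author)

print(func_cached('Paul'))
print(func_cached.cache_info())
```

The cache is not refreshed if list_of_obj changes afterwards; calling func_cached.cache_clear() resets it.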
Abhijeetk431