I've been attempting to compare two lists of dictionaries, and to find the userid's of new people in list2 that aren't in list1. For example the first list:
list1 = [{"userid": "13451", "name": "james", "age": "24", "occupation": "doctor"}, {"userid": "94324""name": "john", "age": "33", "occupation": "pilot"}]
and the second list:
list2 = [{"userid": "13451", "name": "james", "age": "24", "occupation": "doctor"}, {"userid": "94324""name": "john", "age": "33", "occupation": "pilot"}, {"userid": "34892", "name": "daniel", "age": "64", "occupation": "chef"}]
the desired output:
newpeople = ['34892']
This is what I've managed to put together:
list1tuple = ((d["userid"]) for d in list1)
list2tuple = ((d["userid"]) for d in list2)
newpeople = [t for t in list2tuple if t not in list1tuple]
This actually seems to be pretty efficient, especially considering the lists I am using might contain over 50,000 dictionaries. However, here's the issue:
If it finds a userid in list2 that indeed isn't in list1, it adds it to newpeople (as desired), but then also adds every other userid that comes afterwards in list2 to newpeople as well.
So, say list2 contains 600 userids and the 500th userid in list2 isn't found anywhere in list1, the first item in newpeople will be the 500th userid (again, as desired), but then followed by the other 100 userids that came after the new one.
This is pretty perplexing to me - I'd greatly appreciate anyone helping me get to the bottom of why this is happening.