
I'm trying to find matching records between two data sets stored in a dictionary, so that I can do further comparisons on them.

I've confirmed with a print statement that it is finding a matching record (so all of the code before the final if statement is working). However, it is not setting matchingSet2Record for some reason. This causes the body of the final if statement to always run even though a match is found. Declaring the variable as global does not work. What is causing this to happen? How do I set the matchingSet2Record declared at the top of the loop to the record discovered in the nested for loop?

The only problem I'm having with this code is that even though matchingSet2Record is set to the found record properly, it still has a value of None when trying to compare it in the final if statement. The comparison logic is functioning properly.

I have the following function:

def processFile(data):
    # Go through every Record
    for set1Record in data["Set1"]:
        value1 = set1Record["Field1"].strip()
        matchingSet2Record = None

        # Find the EnergyIP record with the meter number
        for set2Record in data["Set2"]:
            if set2Record["Field2"].strip() == value1:
                global matchingSet2Record 
                matchingSet2Record = set2Record 

        # If there was no matching Set2 record, report it
        if matchingSet2Record == None:
            print "Missing"

Updated code per answers/comments (still exhibiting the same issue):

def processFile(data):
    # Go through every Record
    for set1Record in data["Set1"]:
        value1 = set1Record["Field1"].strip()
        matchingSet2Record = None

        # Find the EnergyIP record with the meter number
        for set2Record in data["Set2"]:
            if set2Record["Field2"].strip() == value1:
                matchingSet2Record = set2Record 

        # If there was no matching Set2 record, report it
        if matchingSet2Record == None:
            print "Missing"

"data" is a dictionary of dictionaries. That portion of the code is working properly. When I print matchingSet2Record inside the for loop that sets it to the matching record, it shows that the variable was set properly; when I print it outside of that for loop, it shows a value of None. That is the problem I'm exploring with this code. The problem does not have anything to do with the code that finds a matching record.

ndmeiri
Kirkland
  • if you want it to stop at a certain point in the function, that is what the `return` statement is for. – Tadhg McDonald-Jensen Dec 15 '17 at 20:42
  • That isn't the issue that's being described. I want to be able to change the value of the matchingSet2Record declared as None to the record that is found in the first for..in loop so that it can be accessed in the final if statement. – Kirkland Dec 15 '17 at 20:43
  • the code you have posted raises a `SyntaxError` because `matchingSet2Record` is assigned before it is declared global, please clarify your question because I clearly don't understand what your issue is and your code is invalid. – Tadhg McDonald-Jensen Dec 15 '17 at 20:45
  • what is the type of `data`? Is it a dict? Or are parts of it iterables? – hansaplast Dec 15 '17 at 20:47
  • data is a dictionary of dictionaries. That portion of the code is working properly. When I print matchingSet2Record within the for loop that set's it to the matching record it shows that the variable was set properly, however when I do it outside of the for loop it shows a value of None – Kirkland Dec 15 '17 at 20:49

2 Answers


Don't use the global keyword here. You actually want to set the local variable matchingSet2Record, not the global one.

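A global declaration applies to the entire function body, not just to the lines that follow it, so it cannot be used to make the inner loop "see" a variable defined further up in the same function. It also isn't needed: a for loop does not create a new scope, so a plain assignment to matchingSet2Record inside the nested loop rebinds the same local variable that was initialized to None, and the final if statement then sees the updated value. As written, with the assignment preceding the global declaration, Python 2 emits a SyntaxWarning and Python 3.6 and later reject the code with a SyntaxError.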
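As a minimal sketch of the scoping point (the function name find_match and the sample records are made up for illustration, and Python 3 is assumed): the variable is assigned inside the nested loop without any global declaration and is still visible after the loop ends.

```python
def find_match(value1, set2_records):
    """Return the first record whose Field2 matches value1, or None."""
    matching_record = None  # ordinary local variable
    for record in set2_records:
        if record["Field2"].strip() == value1:
            # Same local as above: a for loop introduces no new scope,
            # so no global (or nonlocal) declaration is needed.
            matching_record = record
            break  # keep the first match; without break the last match wins
    return matching_record


records = [{"Field2": " a "}, {"Field2": "b"}]
print(find_match("a", records))
print(find_match("z", records))
```

The break is a choice made for this sketch; the loop in the question has no break and therefore keeps the last matching record rather than the first.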

ndmeiri

This is not a final answer but it's just too much to put into a comment.

I tried to reproduce your problem with an actual dict of data, but your code really works. So either

  • there are some peculiarities in data (I've seen strange effects when e.g. iterating twice over an iterable, because the iterable was already "consumed"),
  • or it really does not match, because there are invisible differences between the two strings in Field1 and Field2.
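Both possibilities can be checked quickly. The sketch below uses made-up data and assumes Python 3: the first part shows a one-shot iterator yielding nothing on a second pass, and the second shows a character that strip() does not remove and that only repr() makes visible.

```python
# 1) A one-shot iterator (generator, file object, csv reader, ...) is
#    exhausted after the first full pass.
rows = iter([{"Field2": "value1"}])
first_pass = [r["Field2"] for r in rows]
second_pass = [r["Field2"] for r in rows]
print(first_pass)   # contains the record's value
print(second_pass)  # empty: the iterator was already consumed

# 2) strip() removes whitespace only. A byte order mark (U+FEFF), which
#    can show up at the start of text files, is not whitespace and survives.
a = "value1"
b = "\ufeffvalue1"
print(a == b.strip())    # the strings still differ after strip()
print(repr(a), repr(b))  # repr() shows the hidden character as an escape
```

If data["Set2"] turned out to be such a one-shot iterator, the nested loop would find matches only for the first Set1 record; converting it to a list once, before the outer loop, would avoid that.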

This works:

def processFile(data):
    # Go through every Record
    for set1Record in data["Set1"]:
        value1 = set1Record["Field1"].strip()
        matchingSet2Record = None

        # Find the Set2 record whose Field2 matches value1
        for set2Record in data["Set2"]:
            if set2Record["Field2"].strip() == value1:
                matchingSet2Record = set2Record 

        # If there was no matching Set2 record, report it
        if matchingSet2Record is None:
            print("Missing")
        else:
            print("Found")

if __name__ == '__main__':
    data = dict(Set1=[dict(Field1='value1')], Set2=[dict(Field2='value1')])
    processFile(data)

It prints Found

Edit:

If you're into learning Python then you can write the above shorter like this:

data = dict(Set1=[dict(Field1='value1')], Set2=[dict(Field2='value1 ')])
for value1 in [v['Field1'].strip() for v in data['Set1']]:
    try:
        matchingSet2Record = next(v for v in data['Set2'] if v['Field2'].strip() == value1)
        print("found {}".format(matchingSet2Record))
    except StopIteration:
        print('Missing')

The line inside the try block uses a generator expression: (. for . in .) creates a generator, and next() advances it until it yields the first match. If there is no match, the generator is exhausted and next() raises StopIteration, which the except clause catches.
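A variation that avoids the try/except, offered as a sketch rather than as part of the original code: the built-in next() accepts a default as its second argument and returns it when the generator is exhausted. The results dict and the extra 'other' record are added here only for illustration.

```python
data = dict(Set1=[dict(Field1='value1'), dict(Field1='other')],
            Set2=[dict(Field2='value1 ')])

results = {}  # illustration only: maps each Set1 value to its match (or None)
for value1 in (v['Field1'].strip() for v in data['Set1']):
    # next(iterator, default) returns default instead of raising StopIteration
    match = next((v for v in data['Set2'] if v['Field2'].strip() == value1), None)
    results[value1] = match
    print('Missing' if match is None else 'found {}'.format(match))
```

This only works cleanly when None can never be a legitimate record; otherwise the try/except form is the safer choice.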

Or, alternatively, if you only want to find out whether there is any overlap between Set1 and Set2, you could do:

data = dict(Set1=[dict(Field1='value1')], Set2=[dict(Field2='value1')])
field1 = [a['Field1'].strip() for a in data['Set1']]
field2 = [a['Field2'].strip() for a in data['Set2']]
if not set(field1).isdisjoint(field2):
    print('there is at least 1 common element between Set1 and Set2')

See this answer to read more about the isdisjoint part.
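If the goal is instead to know which Set1 values have no counterpart in Set2, a set difference gives that directly. A sketch with made-up sample data (the name missing is introduced here for illustration):

```python
data = dict(Set1=[dict(Field1='value1'), dict(Field1='value2 ')],
            Set2=[dict(Field2=' value1')])

# Collect the stripped key values from each side as sets
field1 = {a['Field1'].strip() for a in data['Set1']}
field2 = {a['Field2'].strip() for a in data['Set2']}

# Values present in Set1 but absent from Set2
missing = field1 - field2
print(sorted(missing))
```

This reports the missing values themselves rather than the records, so it fits when only the report is needed and no further per-record comparison follows.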

hansaplast
  • Ok, I've accepted that as an answer to this question (still wondering why I saw it print a value of "None" before I posted the question). But now it looks like value1 is iterating through the list properly however the value1 in the nested for loop is not changing value. This was determined by adding a print statement after the for loop but before the if statement. Any ideas what could be causing this? – Kirkland Dec 16 '17 at 01:39
  • can you rephrase your question? there's no reassignment of the variable `value1` in the nested (=inner) loop, but only in the outer loop – hansaplast Dec 16 '17 at 06:40