
I pass a list to a function and look at the output. When the list is hardcoded, I get the expected output. But when I build the list from a string and pass that list, with the same content, to the function, I don't get the expected output.

First call:

tech = [ "Django", "Zend", "SQLite", "foo" ]

for tech_item in tech:
    print( test_main( tech_item ) )

Second call:

raw_txt = "Django,Zend,SQLite,foo"
tech = raw_txt.split( "," )
# At this point if we print "tech" we have:
# [ "Django", "Zend", "SQLite", "foo" ]

for tech_item in tech:
    print( test_main( tech_item ) )

So the input seems to be identical, but the output is different.

When I compare the contents of the two lists (renaming the second list to tech2), I get:

print( tech[0], tech2[0] ) #Django Django
print( tech[0] == tech2[0] ) #True
print( type(tech[0]), type(tech2[0]) ) #<class 'str'> <class 'str'>
print( len(tech[0]), len(tech2[0]) ) #6 6

What am I missing? Do you have any clue how to track this down and fix it?

Edit:

Output for the first case:

frameworks
frameworks
SQL
None

Output for the second case:

None
None
None
None

test_main function

Here is the test_main function, though I'm afraid it may be confusing. "looking_for" is the same in both cases, but "tmp" ends up different.

def test_main( looking_for ):
    global tmp
    tmp = None

    get_recursively( languages_tech, looking_for )

    return tmp

get_recursively function

def get_recursively( data, looking_for, last_key="" ):
    if not isinstance( data, (list, dict) ):
        if data is looking_for: #item
            global tmp
            tmp = last_key
    else:
        if isinstance( data, dict ): #Dictionaries
            for key, value in data.items():
                get_recursively( value, looking_for, key )
        else:
            for item in data: #list
                get_recursively( item, looking_for, last_key )

Languages_tech

languages = { "languages": [
"Ruby", "Python", "JavaScript", "ASP.NET", "Java", "C", "C++", "C#", "Swift", "PHP", "Visual Basic", "Bash" ] }

frameworks = { "frameworks" : [
"Django", "Flask", "React", "React Native", "Vue", "Ember", "Meteor", "AngularJS", "Express" , "Laravel", "Symfony", "Zend", "Ruby on Rails" ] }

databases = { "databases" : [
{ "SQL": ["MariaDB", "MySQL", "SQLite", "PostgreSQL", "Oracle", "MSSQL Server"] },
{ "NoSQL": ["Cassandra", "CouchDB", "MongoDB", "Neo4j", "OrientDB", "Redis", "Elasticsearch"] },
{ "ORM Framework": [ "SQLAlchemy", "Django ORM" ] } ] }

languages_tech = { "languages_tech": [ languages, frameworks, databases ]  }
seb_seb
  • `print( tech[0] is tech2[0] )` will always print `False` if `tech` and `tech2` are two different lists. – TrebledJ Dec 03 '18 at 09:16
  • You usually don't want to use `is` to compare strings. Strings that compare equal are not necessarily the same object. – Stop harming Monica Dec 03 '18 at 09:45
  • In `get_recursively` you should compare strings with `==`, not with `is`: `if data == looking_for: #item`. This is explained here: [Why does comparing strings in Python using either '==' or 'is' sometimes produce a different result?](https://stackoverflow.com/questions/1504717/why-does-comparing-strings-in-python-using-either-or-is-sometimes-produce). (Could someone mark this question as a duplicate? I made a mistake by deleting the "possible duplicate" comment.) – Georgy Dec 03 '18 at 10:03
  • Possible duplicate of [Why does comparing strings in Python using either '==' or 'is' sometimes produce a different result?](https://stackoverflow.com/questions/1504717/why-does-comparing-strings-in-python-using-either-or-is-sometimes-produce) – b-fg Dec 03 '18 at 10:10

1 Answer


Short Answer

The following line in your get_recursively() function is the problem, because it tests object identity rather than string contents:

if data is looking_for:

Use this instead

if data == looking_for:
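For illustration, here is a minimal sketch of how the search could look with the `==` fix applied. It is not your original code: `find_parent_key` is a hypothetical name, it returns the key instead of setting the global `tmp`, it stops at the first match (your version keeps the last one), and the data is a trimmed-down sample of `languages_tech`.

```python
# Hypothetical rewrite of get_recursively: returns the key instead of using a global.
def find_parent_key(data, looking_for, last_key=None):
    """Return the key of the innermost dict entry that holds looking_for, or None."""
    if isinstance(data, dict):
        for key, value in data.items():
            found = find_parent_key(value, looking_for, key)
            if found is not None:
                return found
    elif isinstance(data, list):
        for item in data:
            found = find_parent_key(item, looking_for, last_key)
            if found is not None:
                return found
    elif data == looking_for:  # compare by value, not by identity
        return last_key
    return None


# Trimmed-down sample data (assumption: same shape as languages_tech in the question).
sample_tech = {
    "languages_tech": [
        {"frameworks": ["Django", "Zend"]},
        {"databases": [{"SQL": ["SQLite"]}]},
    ]
}

# Build the list from a string, as in the failing second call.
results = [find_parent_key(sample_tech, item)
           for item in "Django,Zend,SQLite,foo".split(",")]
print(results)  # ['frameworks', 'frameworks', 'SQL', None]
```

Returning the result directly also avoids having to reset `tmp` before every call.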

Long Answer

`a is b` evaluates to True only if a and b are the same object, that is, if they have the same id:

(a is b) == (id(a) == id(b))

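In CPython, identical string literals usually end up as the same object and therefore share an id. This is an implementation detail (string interning and constant caching), not something the language guarantees. For example,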

>>> a = "Trebuchet"
>>> b = "Trebuchet"
>>> id(a), id(b)
(4416391792, 4416391792)

Note that both ids are 4416391792. This carries over to strings inside lists as well (even though the lists themselves are different objects and so have different ids).

>>> a = ["Trebuchet", "Catapult", "Ballista"]
>>> b = ["Trebuchet", "Catapult", "Ballista"]
>>> id(a), id(b)
(4416392200, 4416861640)

>>> id(a[0]), id(b[0])          
(4416391792, 4416391792)

Note that 4416391792 is the same number as in the previous example. This shows that the equal string literals all refer to the same object.

But when you bring in the str.split() function...

>>> a = "Trebuchet;Catapult;Ballista".split(';')
>>> b = ["Trebuchet", "Catapult", "Ballista"]
>>> id(a[0]), id(b[0])          
(4416392240, 4416391792)

id(b[0]) is still 4416391792, which we've seen before. But str.split() creates new string objects, so a[0] has a different id (4416392240).

This is why `data is looking_for` evaluates to False in your second case: the strings produced by split() are equal to the ones stored in languages_tech, but they are not the same objects.
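A small check you can run yourself, as a sketch using the strings from your question (the variable names are mine). The identity result is what I'd expect on CPython, but it is implementation-dependent, so only the `==` line is something to rely on.

```python
literal = "Django"                               # hardcoded string
parts = "Django,Zend,SQLite,foo".split(",")      # strings built at runtime

# Value comparison: True, because the characters are the same.
print(parts[0] == literal)

# Identity comparison: typically False on CPython, because split() built a
# new string object. Never rely on this result either way.
print(parts[0] is literal)

# `is` is just an id comparison, whatever the outcome.
print((parts[0] is literal) == (id(parts[0]) == id(literal)))
```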

Of course, `is` has its uses. For instance, we write `a is None` rather than `a == None`. But it's important to know when to use `is` and when to use `==`. To compare values such as strings, lists, or tuples, use `==`.
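As a rough illustration of that rule of thumb (`describe` is a made-up helper, not something from your code):

```python
def describe(value=None):
    # None is a singleton, so an identity test is the idiomatic check.
    if value is None:
        return "missing"
    # Strings are compared by value.
    if value == "":
        return "empty"
    return "present"


print(describe())          # missing
print(describe(""))        # empty
print(describe("SQLite"))  # present
```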


Further Reading:

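[Why does comparing strings in Python using either '==' or 'is' sometimes produce a different result?](https://stackoverflow.com/questions/1504717/why-does-comparing-strings-in-python-using-either-or-is-sometimes-produce)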

Another example where variables can have congruent string literals but different ids.

TrebledJ