348

I need to know if a variable in Python is a string or a dict. Is there anything wrong with the following code?

if type(x) == type(str()):
    do_something_with_a_string(x)
elif type(x) == type(dict()):
    do_something_with_a_dict(x)
else:
    raise ValueError

Update: I accepted avisser's answer (though I will change my mind if someone explains why isinstance is preferred over type(x) is).

But thanks to nakedfanatic for reminding me that it's often cleaner to use a dict (as a case statement) than an if/elif/else series.

Let me elaborate on my use case. If a variable is a string, I need to put it in a list. If it's a dict, I need a list of the unique values. Here's what I came up with:

def value_list(x):
    cases = {str: lambda t: [t],
             dict: lambda t: list(set(t.values()))}
    try:
        return cases[type(x)](x)
    except KeyError:
        return None

If isinstance is preferred, how would you write this value_list() function?

Robert
Daryl Spitzer
  • 10
    IMHO, isinstance() is better because you test the type of a variable against a class without having to allocate or create anything. I mean: when you do ``type(str())``, you are creating an instance of str just for the sake of obtaining its type. The object just created is then discarded and later garbage collected. You don't need anything like this, because the type you are testing against is known in advance, so it is more efficient to do ``isinstance(variable, type)``. – Richard Gomes Jul 20 '13 at 13:04
  • 13
    @RichardGomes Actually, you confuse two different topics. If the goal is to avoid allocating a `str`, then the coder should simply say `str` instead of `type(str())`. Assuming the coder meant what they said, which is to test for an EXACT type. The point of `isinstance` is to allow subtypes. Which may or may not have been wanted. E.g. collections.OrderedDict is a subclass of dict, so IF the coder wants to allow those also, THEN it is correct to change the code from `type(x) == dict` to `isinstance(x, dict)`. NOT to avoid allocating, BUT to change the meaning to "a subclass is acceptable". – ToolmakerSteve Dec 12 '13 at 22:50
  • 3
    Allocating something empty and checking with `type()` is improper not only because it creates a useless instance, but also because you can't check, for example, whether your object is a `file` without creating a file on the filesystem (`type(file())` fails because `file()` requires at least one argument) – dappiu Sep 26 '14 at 23:55
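
To make the distinction from the comments above concrete, here is a minimal sketch (Python 3 assumed) comparing an exact-type test with isinstance on a dict subclass. The variable names are only for illustration.

```python
from collections import OrderedDict

od = OrderedDict(a=1)

# Exact-type test: only a plain dict passes, subclasses do not.
exact_match = type(od) is dict

# isinstance test: dict and every subclass of dict pass.
subclass_match = isinstance(od, dict)

print(exact_match, subclass_match)  # False True
```

Which of the two is right depends on whether a subclass should be accepted, not on allocation cost.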

10 Answers

345

What happens if somebody passes a unicode string to your function? Or a class derived from dict? Or a class implementing a dict-like interface? The following code covers the first two cases. If you are using Python 2.6 or later, you might want to use collections.Mapping instead of dict, as per the ABC PEP (PEP 3119).

def value_list(x):
    if isinstance(x, dict):
        return list(set(x.values()))
    elif isinstance(x, basestring):
        return [x]
    else:
        return None
Jonathon Reinhart
Suraj
  • 2
    It would be nice to see how collections.Mapping enters into this discussion. What are its advantages? Can we see some example code, to understand how it compares? The ABC PEP link is rather heavy with theory and quite a lot to consume when the goal is to simply test if something is a dict or string. Is there additional effort to implement ABCs, and (especially for a simple use case) is the added effort worth it? – Mike S Jun 12 '15 at 14:52
  • 1
    Direct link to help page: https://docs.python.org/2.7/library/collections.html#collections-abstract-base-classes – Suraj Jun 16 '15 at 08:03
  • 5
    The collections.Mapping ABC provides a simple way to check if the object behaves like a dict. The change in code would be to replace `isinstance(x, dict)` with `isinstance(x, collections.Mapping)`. This provides the additional benefit of matching objects that do not derive from dict but provide a similar interface. – Suraj Jun 16 '15 at 08:19
  • A great way to check for string-like (from Beginning Python: From Novice to Professional) is to try + '' and check for a TypeError. If TypeError is not raised, then it is string-like. Only downside is that I don't know the performance cost if the string is large. Maybe the interpreter is smart enough that there is essentially zero cost? I don't know. – Steve Jorgensen Oct 02 '19 at 11:14
  • @SteveJorgensen Other than being a clever trick, is there an advantage to using that method over isinstance? – Dymas Feb 26 '20 at 20:04
  • basestring was replaced by str in Python 3 – luckyguy73 May 08 '21 at 14:24
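
For readers on Python 3, here is a hedged sketch of how the function might look there: basestring becomes str, and the Mapping ABC is imported from collections.abc. MappingProxyType is used only as a convenient stdlib example of a mapping that is not a dict subclass.

```python
from collections.abc import Mapping
from types import MappingProxyType


def value_list(x):
    """Return [x] for a string, the unique values for a mapping, else None."""
    if isinstance(x, Mapping):
        # set() requires the values to be hashable.
        return list(set(x.values()))
    if isinstance(x, str):
        return [x]
    return None


# A read-only view over a dict: dict-like, but not a dict subclass.
read_only = MappingProxyType({"a": 1, "b": 1})

print(value_list("abc"))           # ['abc']
print(value_list(read_only))       # [1]
print(isinstance(read_only, dict)) # False
```

The Mapping check accepts the read-only view that an isinstance(x, dict) check would have rejected.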
63

type(dict()) says "make a new dict, and then find out what its type is". It's quicker to say just dict. But if you want to just check type, a more idiomatic way is isinstance(x, dict).

Note that isinstance also includes subclasses (thanks Dustin):

class D(dict):
    pass

d = D()
print("type(d) is dict", type(d) is dict)  # -> False
print("isinstance (d, dict)", isinstance(d, dict))  # -> True
Michael Geary
46

Built-in types in Python have built-in names:

>>> s = "hallo"
>>> type(s) is str
True
>>> s = {}
>>> type(s) is dict
True

Btw, note the is operator. However, type checking (if you want to call it that) is usually done by wrapping a type-specific operation in a try-except clause, as it's not so much the type of the variable that's important, but whether you can do a certain something with it or not.
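
A minimal sketch of that try-except style applied to the question's use case. It assumes that anything without a values() method should be treated like a single string, which is a simplification of mine and not something the question states.

```python
def value_list(x):
    """Try the dict-specific operation first; fall back if it is unsupported."""
    try:
        values = x.values()
    except AttributeError:
        # Assumption: no .values() means "treat x as a single item".
        return [x]
    return list(set(values))


print(value_list("hallo"))            # ['hallo']
print(value_list({"a": 1, "b": 1}))   # [1]
```

Only the attribute lookup and call sit inside the try block, so an error raised while building the set is not swallowed by mistake.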

Albert Visser
23

isinstance is preferable to type because it also evaluates to True when you compare an object instance with its superclass, which basically means you won't ever have to special-case your old code to use it with dict or str subclasses.

For example:

 >>> class a_dict(dict):
 ...     pass
 ... 
 >>> type(a_dict()) == type(dict())
 False
 >>> isinstance(a_dict(), dict)
 True
 >>> 

Of course, there might be situations where you wouldn't want this behavior, but those are, hopefully, a lot less common than situations where you do want it.

Dirk Stoop
8

I think I will go for the duck typing approach: "if it walks like a duck and quacks like a duck, it's a duck". This way you need not worry about whether the string is unicode or ASCII.

Here is what I will do:

In [53]: s='somestring'

In [54]: u=u'someunicodestring'

In [55]: d={}

In [56]: for each in s,u,d:
    if hasattr(each, 'keys'):
        print list(set(each.values()))
    elif hasattr(each, 'lower'):
        print [each]
    else:
        print "error"
   ....:         
   ....:         
['somestring']
[u'someunicodestring']
[]

The experts here are welcome to comment on this use of duck typing. I have been using it, but was only recently introduced to the exact concept behind it and am very excited about it. So I would like to know whether it is overkill.

JV.
  • 6
    It seems likely that this could potentially yield false positives -- if we're worried about that kind of thing. ie... my 'Piano' class also has 'keys' – nakedfanatic Dec 19 '08 at 01:46
  • 1
    It depends on the dataset: if I know I just have dictionaries and strings (unicode or ASCII), then it will work flawlessly. Yes, in the grand scheme of things, you are correct in saying that it might lead to false positives. – JV. Dec 19 '08 at 02:12
  • It seems to me that this example assumes that strs and unicodes walk and quack the same. That is not the case. If you changed print [each] to print [s + each] you would see an example where they quack differently... – GreenAsJade Oct 16 '13 at 08:36
  • A useful walkthru of "duck typing" in this context: http://canonical.org/~kragen/isinstance/ – Carl Aug 14 '16 at 12:17
  • 1
    Answer++. Duck typing is THE way, or the power of an interpreted language will be completely lost. – Marco Sulla Sep 16 '16 at 10:38
  • This way is preferred to isinstance as you don't know how someone is going to interact with your code (especially for libraries). isinstance will cause you problems if you need to mock an object for testing, for example. I wish this were upvoted more! – Aaron D May 14 '19 at 21:49
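
The false-positive concern raised in the comments can be sketched as follows (Python 3 assumed). Piano is a hypothetical class taken from the comment, and the Mapping ABC is shown only as one possible stricter check.

```python
from collections.abc import Mapping


class Piano:
    """Hypothetical class that has keys() but is not dict-like."""

    def keys(self):
        return ["C", "D", "E"]


piano = Piano()

# The attribute probe is satisfied by any object with a 'keys' attribute.
has_keys = hasattr(piano, "keys")

# The ABC check is not: Piano neither subclasses nor registers with Mapping.
is_mapping = isinstance(piano, Mapping)

print(has_keys, is_mapping)  # True False
```

With the hasattr test, the Piano would reach the dict branch and fail there with an AttributeError when values() is called.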
7

I think it might be preferred to actually do

if isinstance(x, str):
    do_something_with_a_string(x)
elif isinstance(x, dict):
    do_something_with_a_dict(x)
else:
    raise ValueError

There are two alternate forms; depending on your code, one or the other is probably considered even better than that. One is to not look before you leap:

try:
  one, two = tupleOrValue
except TypeError:
  one = tupleOrValue
  two = None

The other approach is from Guido and is a form of function overloading which leaves your code more open ended.

http://www.artima.com/weblogs/viewpost.jsp?thread=155514
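
As a related illustration, the standard library's functools.singledispatch (Python 3.4+) provides a comparable form of overloading on the type of the first argument. This is a swapped-in stdlib technique, not the implementation from the linked post, and raising TypeError in the fallback is my own choice.

```python
from functools import singledispatch


@singledispatch
def value_list(x):
    # Fallback for types with no registered implementation.
    raise TypeError("unsupported type: %s" % type(x).__name__)


@value_list.register(str)
def _(x):
    return [x]


@value_list.register(dict)
def _(x):
    return list(set(x.values()))


print(value_list("abc"))             # ['abc']
print(value_list({"a": 1, "b": 1}))  # [1]
```

Dispatch follows the method resolution order, so subclasses of str and dict are routed to the registered implementations, and new types can be registered later without editing the original function.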

S.Lott
Ed.
3

That should work - so no, there is nothing wrong with your code. However, it could also be done with a dict:

{type(str()): do_something_with_a_string,
 type(dict()): do_something_with_a_dict}.get(type(x), errorhandler)(x)

A bit more concise and pythonic, wouldn't you say?


Edit: Heeding avisser's advice, the code also works like this, and looks nicer:

{str: do_something_with_a_string,
 dict: do_something_with_a_dict}.get(type(x), errorhandler)(x)
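
A runnable sketch of this dict-based dispatch. The handler bodies and the errorhandler signature are placeholders I made up for illustration; the question does not define them.

```python
from collections import OrderedDict


def do_something_with_a_string(x):
    return "string: " + x          # placeholder behaviour


def do_something_with_a_dict(x):
    return "dict with %d keys" % len(x)  # placeholder behaviour


def errorhandler(x):
    raise ValueError("unsupported type: %r" % type(x))


HANDLERS = {
    str: do_something_with_a_string,
    dict: do_something_with_a_dict,
}


def dispatch(x):
    # Look up the handler by exact type, then call it with the value.
    return HANDLERS.get(type(x), errorhandler)(x)


print(dispatch("abc"))     # string: abc
print(dispatch({"k": 1}))  # dict with 1 keys
```

Because the lookup key is the exact type, a subclass such as OrderedDict falls through to errorhandler; that is the trade-off against isinstance.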
nakedfanatic
3

You may want to check out typecheck. http://pypi.python.org/pypi/typecheck

Type-checking module for Python

This package provides powerful run-time typechecking facilities for Python functions, methods and generators. Without requiring a custom preprocessor or alterations to the language, the typecheck package allows programmers and quality assurance engineers to make precise assertions about the input to, and output from, their code.

Paul Hildebrandt
1

I've been using a different approach:

from inspect import getmro

if type([]) in getmro(obj.__class__):
    pass  # obj is a list, or an instance of a list subclass
elif type({}) in getmro(obj.__class__):
    pass  # obj is a dict, or an instance of a dict subclass

I can't remember why I used this instead of isinstance, though...
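
For comparison, a small sketch (Python 3 assumed, MyList is a hypothetical class) showing that the MRO membership test agrees with isinstance for ordinary subclasses, but not for abstract base classes that a type is only registered with.

```python
from collections.abc import Mapping
from inspect import getmro


class MyList(list):
    pass


obj = MyList()

# For real inheritance the two tests give the same answer.
via_mro = list in getmro(obj.__class__)
via_isinstance = isinstance(obj, list)

# dict is registered with Mapping but does not inherit from it,
# so Mapping is absent from dict's MRO.
mapping_in_mro = Mapping in getmro(dict)
mapping_isinstance = isinstance({}, Mapping)

print(via_mro, via_isinstance)             # True True
print(mapping_in_mro, mapping_isinstance)  # False True
```

So isinstance covers everything the getmro test does, plus virtual subclasses.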

Matthew Schinckel
-2

*sigh*

No, typechecking arguments in python is not necessary. It is never necessary.

If your code accepts either a string or a dict object, your design is broken.

That comes from the fact that if you don't already know the type of an object in your own program, then you're already doing something wrong.

Typechecking hurts code reuse and reduces performance. A function that does different things depending on the type of the object passed is bug-prone, and its behavior is harder to understand and maintain.

You have the following saner options:

1) Make a function unique_values that converts dicts into lists of unique values:

def unique_values(some_dict):
    return list(set(some_dict.values()))

Make your function assume the argument passed is always a list. That way, if you need to pass a string to the function, you just do:

myfunction([some_string])

If you need to pass it a dict, you do:

myfunction(unique_values(some_dict))

That's your best option: it is clean, and easy to understand and maintain. Anyone reading the code immediately understands what is happening, and you don't have to typecheck.

2) Make two functions, one that accepts lists of strings and one that accepts dicts. You can make one call the other internally, in the most convenient way (myfunction_dict can create a list of strings and call myfunction_list).

In any case, don't typecheck. It is completely unnecessary and has only downsides. Refactor your code instead so that you don't need to typecheck. You only get benefits from doing so, both in the short and the long run.
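
A minimal sketch of option 2, assuming placeholder work (upper-casing each string) since the answer does not say what myfunction actually does.

```python
def unique_values(some_dict):
    return list(set(some_dict.values()))


def myfunction_list(strings):
    # Placeholder for the real work: the answer leaves this unspecified.
    return [s.upper() for s in strings]


def myfunction_dict(some_dict):
    # Convert to a list of strings, then reuse the list version.
    return myfunction_list(unique_values(some_dict))


print(myfunction_list(["abc"]))                  # ['ABC']
print(sorted(myfunction_dict({"x": "a", "y": "a", "z": "b"})))  # ['A', 'B']
```

The caller picks the function that matches the data it holds, so neither function inspects the type of its argument.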

nosklo
  • 7
    +1 for a solid comment. Personally I stay away from all kinds of typechecking. If I need to, I prefer "It's easier to ask for forgiveness than permission". I `try` an operation and `except` the error. Never do `if` then this `else` this. – Jeffrey Jose Feb 22 '10 at 19:12
  • 26
    Typechecking can be useful when you are writing things like functions that are exposed over RPC — it can be helpful to check that you indeed got an int and a string like you expected, as part of a larger practice of thoroughly checking everything about external untrusted input. – Brandon Rhodes Mar 15 '11 at 18:09
  • 1
    @Brandon Craig Rhodes: parameters to functions exposed over RPC are checked automatically by the RPC framework you're using. External input is *always* bytestrings, so you don't have to check at all. Just convert it using `int()` and you'll get an error automatically if that is not possible. – nosklo Mar 15 '11 at 22:15
  • 5
    Several popular Python RPC libraries perform no type checking; for example, SimpleXMLRPCServer. When using such libraries, a typechecking library like "typecheck" can save lots of isinstance() code! – Brandon Rhodes Mar 16 '11 at 22:46
  • 11
    One is not always in control of what types are given to a certain function, especially when it is interfacing with other scripts. – Adam S Dec 15 '11 at 00:07
  • 3
    You should tell Peter Norvig that the design of his Lisp interpreter is broken http://norvig.com/lispy.html – codebox Jul 29 '12 at 15:43
  • 45
    "No, typechecking arguments in python is not necessary. It is never necessary." Wrong. So fundamentally wrong it hurts. 'nuff said. – Jürgen A. Erhard Jan 08 '13 at 18:09
  • 2
    @JürgenA.Erhard That's easy to say without providing an example where they are necessary – nosklo Jan 11 '13 at 19:45
  • 11
    I really should be wise enough not to bother, but riddle me this: why, if it's *never* necessary (your exact words) does Python have "isinstance"? Just for laughs? EOD for me. – Jürgen A. Erhard Jan 13 '13 at 00:21
  • 1
    @JürgenA.Erhard python is not the perfect language, it is full of mistakes – nosklo Jan 17 '13 at 20:02
  • 55
    The classic case for having something which might be either a string or a dictionary is when you're building a tree. At any point, a left hand subtree might be a terminal (ie. a string) or might be a further sub-tree (ie. a dictionary). Any function which recurses down such a tree needs to take an argument which is either a string or a dictionary, and will need to test which it is in order to deal with it intelligently. I'd like to see how you'd handle this situation without type-checking, @nosklo – interstar Feb 05 '13 at 05:17
  • 2
    +1 for "you need to know the type of a thing" when you're building a tree or if you are recursing a structure - for example making a deep copy.... – GreenAsJade Sep 26 '13 at 09:27
  • 2
    @interstar each node of the tree could have a boolean attribute which would be True if the node is a leaf node. That way I can have any type in the tree (not just strings/dicts) and the code just has to read that attribute to know if the subtree goes down further or not. – nosklo Oct 17 '13 at 11:23
  • @interstar another solution would be to have fixed schema trees. That way code will know beforehand how deep each branch of the tree is, without having to check. It depends on your use case. Use your imagination and you won't need to typecheck. – nosklo Oct 17 '13 at 11:24
  • @GreenAsJade deeply copying something is frowned upon. Remember the zen of python, *flat is better than nested* – nosklo Oct 17 '13 at 11:26
  • 4
    @nosko Your logic, "flat is better than nested therefore typechecking is *never* necessary", is flawed. Just because something is better than something else doesn't mean that the something else is never necessary. I would be curious to see an argument that a data structure and algorithms that result in a deep copy being required are "never necessary" – GreenAsJade Oct 22 '13 at 22:34
  • @GreenAsJade I believe it can always be rewritten as a flat structure algorithm, so that you won't have to typecheck. Which algorithm are you referring to, that can't be rewritten to use a flat structure? – nosklo Nov 12 '13 at 05:21
  • 1
    @nosko: You are trying to prove that a generalisation ("something is *never* necessary") is true by asking me to come up with an example of a specific instance that you will attempt to refute. That will be futile: I may or may not be able to come up with such an example. I was curious to see the argument that demonstrates it is *never* necessary (your emphasis on never). – GreenAsJade Nov 13 '13 at 07:01
  • 5
    @nosko: FWIW, how about dealing with JSON? If I parse a JSON file I get a nested datastructure... would you really somehow "flatten" this before you start using it? – GreenAsJade Nov 13 '13 at 07:46
  • 3
    Seriously, you're going for the NetNews "you are an idiot" answer? Please, let's stay in the 21st Century. Just like GreenAsJade points out, what if you're using json.JSONDecoder.decode(), which returns completely different types based on the JSON string? I ran into this question precisely because of json.JSONDecoder.decode(). – Mark Gerolimatos Oct 16 '14 at 02:45
  • I use type checking in python to parse a list in place. I have a list of strings, and I parse them into objects as I need them. Call it "Lazy Parsing." Then when I need to use one the items in the list (or dictionary, I've done that too) I type check to see if it has been parsed yet. If not, parse it and replace the string with the parsed value. It's faster than just automatically parsing all the strings. – Ryan Nov 27 '14 at 01:20
  • 1
    `plistlib` would like to have a `word` with you –  Dec 21 '15 at 17:28
  • 3
    get off your hipster soapbox; there are plenty of reasons why someone might need to do this (e.g. backward compatibility), but none of them are your business. – tpow Nov 19 '16 at 03:13
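
The JSON case raised in the comments can be sketched like this: json.loads returns nested dicts, lists, strings, numbers, booleans and None, and a recursive walk has to tell them apart. The count_strings helper and the sample document are made up for illustration.

```python
import json


def count_strings(node):
    """Count the string values (not keys) anywhere in a decoded JSON tree."""
    if isinstance(node, str):
        return 1
    if isinstance(node, dict):
        return sum(count_strings(v) for v in node.values())
    if isinstance(node, list):
        return sum(count_strings(v) for v in node)
    # Numbers, booleans and None contribute nothing.
    return 0


doc = json.loads('{"a": "x", "b": {"c": "y", "d": [1, "z", null]}}')
print(count_strings(doc))  # 3
```

Here the type of each node is only known at run time, which is the situation the commenters describe.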