
Background:
I mostly run Python scripts from the command line in pipelines, so my arguments are always strings that need to be cast to the appropriate type. I write a lot of little scripts each day, and casting each parameter for every script takes more time than it should.

Question:
Is there a canonical way to automatically type cast parameters for a function?

My Way:
I've developed a decorator to do what I want, in case there isn't a better way. The decorator is the autocast function below; the decorated function is fxn2 in the example. Note that at the end of the code block I pass '1' and '2' as strings, and if you run the script it automatically adds them. Is this a good way to do this?

def estimateType(var):
    # first test bools
    if var == 'True':
        return True
    elif var == 'False':
        return False
    else:
        # int
        try:
            return int(var)
        except ValueError:
            pass
        # float
        try:
            return float(var)
        except ValueError:
            pass
        # string
        try:
            return str(var)
        except ValueError:
            raise NameError('Something Messed Up Autocasting var %s (%s)'
                            % (var, type(var)))

def autocast(dFxn):
    '''Still need to figure out if you pass a variable with kw args!!!
    I guess I can just pass the dictionary to the fxn **args?'''
    def wrapped(*c, **d):
        print(c, d)
        t = [estimateType(x) for x in c]
        return dFxn(*t)
    return wrapped

@autocast
def fxn2(one, two):
    print(one + two)

fxn2('1', '2')

EDIT: For anyone that comes here and wants the updated and concise working version go here:

https://github.com/sequenceGeek/cgAutoCast

And here is also a quick working version based on the above:

def boolify(s):
    if s == 'True' or s == 'true':
        return True
    if s == 'False' or s == 'false':
        return False
    raise ValueError('Not Boolean Value!')

def estimateType(var):
    '''guesses the str representation of the variable's type'''
    var = str(var)  # important if the parameters aren't strings...
    for caster in (boolify, int, float):
        try:
            return caster(var)
        except ValueError:
            pass
    return var

def autocast(dFxn):
    def wrapped(*c, **d):
        cp = [estimateType(x) for x in c]
        dp = dict((i, estimateType(j)) for (i, j) in d.items())
        return dFxn(*cp, **dp)
    return wrapped

######usage######
@autocast
def randomFunction(firstVar, secondVar):
    print(firstVar + secondVar)

randomFunction('1', '2')
sequenceGeek
    It's "cast," not "caste" – NullUserException Aug 10 '11 at 23:44
  • Why are you trying to cast it to string? You're assuming it's a string, as `var == "True"` and `var == "False"` will give unexpected results for any other type. – agf Aug 10 '11 at 23:50
  • @agf That won't cast it to string; Python is not statically typed (the OP wants to automatically convert a string to some other value based on its contents) – li.davidm Aug 10 '11 at 23:52
  • The line `str(var)` tries to cast it to a string, but this is after he's already assumed it's a string. For example `1 == True` or `0 == False` will evaluate to `True`, which isn't what he wants. Therefore, he should either check if it's a string before making those comparisons, or he should assume it's a string and just `return var` if it's not an `int` or `float`. – agf Aug 10 '11 at 23:56
  • FWIW `estimateType`, at first glance, appears to only coerce types if it can detect `var`'s type. But you end up always getting a string. In my implementation I substituted that first internal assignment of `var` to use a temp variable so if no type can be estimated I'm at least given back the original. – Trindaz Jan 10 '14 at 08:11

6 Answers


If you want to auto-convert values:

def boolify(s):
    if s == 'True':
        return True
    if s == 'False':
        return False
    raise ValueError("huh?")

def autoconvert(s):
    for fn in (boolify, int, float):
        try:
            return fn(s)
        except ValueError:
            pass
    return s

You can adjust boolify to accept other boolean values if you like.
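For instance, a more permissive `boolify` can use a lookup table instead of chained comparisons, as suggested in the comments below. The extra spellings (`'yes'`, `'no'`) are an assumption for illustration, not part of the original answer; note that it deliberately raises `ValueError` (not `KeyError`) so it still composes with `autoconvert`'s `except ValueError`:

```python
# Sketch: a lookup-table boolify accepting several common spellings.
_BOOL_STRINGS = {'true': True, 'false': False, 'yes': True, 'no': False}

def boolify(s):
    try:
        return _BOOL_STRINGS[s.lower()]
    except (KeyError, AttributeError):
        # Re-raise as ValueError so autoconvert's except clause catches it.
        raise ValueError('Not a boolean value: %r' % (s,))
```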

Ned Batchelder
  • +1. I think you could also use a map for boolify to simplify that code a bit. E.g., replace the body of `boolify` with `return {'True': True, 'False': False}[s]`. – Neil G Aug 11 '11 at 05:36
  • Yes, after I answered, I thought about switching to a map, but got distracted and wandered away... Squirrel! – Ned Batchelder Aug 11 '11 at 12:09
  • What if someone does `autoconvert('0')`, hoping it is interpreted as an int? Won't it convert it to `False`? – Aaron Esau Feb 23 '18 at 02:17
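For the record, `'0'` is safe with this answer's `boolify`: it raises `ValueError` for anything but the exact strings `'True'`/`'False'`, so `autoconvert` falls through to `int`. (The map variant from the comments would raise `KeyError` instead, which `autoconvert` does not catch.) A quick check, repeating the answer's definitions:

```python
def boolify(s):
    if s == 'True':
        return True
    if s == 'False':
        return False
    raise ValueError("huh?")

def autoconvert(s):
    for fn in (boolify, int, float):
        try:
            return fn(s)
        except ValueError:
            pass
    return s

# '0' is not 'True' or 'False', so boolify raises and int('0') wins.
result = autoconvert('0')
print(result, type(result))  # 0 <class 'int'>
```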

You could just run plain eval on the input string if you trust the source:

>>> eval("3.2", {}, {})
3.2
>>> eval("True", {}, {})
True

But if you don't trust the source, you could use literal_eval from the ast module.

>>> import ast
>>> ast.literal_eval("'hi'")
'hi'
>>> ast.literal_eval("(5, 3, ['a', 'b'])")
(5, 3, ['a', 'b'])

Edit: As Ned Batchelder's comment points out, it won't accept unquoted strings, so I added a workaround, plus an example of an autocaste decorator that handles keyword arguments.

import ast

def my_eval(s):
    try:
        return ast.literal_eval(s)
    except (ValueError, SyntaxError):  # not a literal; assume it's a plain string
        return s                       # thanks gnibbler

def autocaste(func):
    def wrapped(*c, **d):
        cp = [my_eval(x) for x in c]
        dp = {i: my_eval(j) for i, j in d.items()}  # dict comprehension, Python 2.7+
        # use dict((i, my_eval(j)) for i, j in d.items()) on older versions
        return func(*cp, **dp)

    return wrapped

@autocaste
def f(a, b):
    return a + b

print(f("3.4", "1")) # 4.4
print(f("s", "sd"))  # ssd
print(my_eval("True")) # True
print(my_eval("None")) # None
print(my_eval("[1, 2, (3, 4)]")) # [1, 2, (3, 4)]
utdemir
  • 26,532
  • 10
  • 62
  • 81
  • This won't accept simple strings on the command line: they'd all have to be quoted. – Ned Batchelder Aug 10 '11 at 23:54
  • @Ned Batchelder, you're right, added a workaround, but not seems so good. – utdemir Aug 11 '11 at 00:06
  • Instead of quoting the string, you could just `try` doing a `literal_eval` and return the original string if there is an `exception` – John La Rooy Aug 11 '11 at 00:08
  • I'm curious about two things: speed and being able to evaluate lists. The power of being able passing a list from the command line begets worry in my mind, but I guess I have to accept that if I'm going to be autocasting in the first place. The other question is whether if I use a fxn that has been decorated and I run that function a lot, will it significantly slow down? I wonder how fast eval is compared to a fxn like Ned's below that only handles basic types? – sequenceGeek Aug 11 '11 at 00:31
  • Yes, it really depends on speed you're expecting, but using typecasting like `int(s)` is a lot faster than eval(maybe 20-30x). And If you want only basic types, and speed is an important criteria, [Ned Batchelder's answer](http://stackoverflow.com/questions/7019283/automatically-type-cast-parameters-in-python/7019382#7019382) fits best for you I think. – utdemir Aug 11 '11 at 00:44
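To put rough numbers on that speed question, here is a micro-benchmark sketch. Absolute times vary by machine, so only the ratio is meaningful; the specific repetition count is an arbitrary choice:

```python
import timeit

# Compare a plain int() cast against ast.literal_eval on the same string.
n = 50_000
int_time = timeit.timeit("int('12345')", number=n)
eval_time = timeit.timeit("ast.literal_eval('12345')",
                          setup="import ast", number=n)

# literal_eval parses the string into an AST first, so it is expected
# to be much slower than a direct cast.
print('int():          %.4fs' % int_time)
print('literal_eval(): %.4fs' % eval_time)
```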

I'd imagine you can make a type signature system with a function decorator, much like you have, only one that takes arguments. For example:

@signature(int, str, int)
def func(x, y, z):
    ...

Such a decorator can be built rather easily. Something like this (EDIT -- works!):

def signature(*args, **kwargs):
    def decorator(fn):
        def wrapped(*fn_args, **fn_kwargs):
            new_args = [t(raw) for t, raw in zip(args, fn_args)]
            new_kwargs = dict([(k, kwargs[k](v)) for k, v in fn_kwargs.items()])

            return fn(*new_args, **new_kwargs)

        return wrapped

    return decorator

And just like that, you can now imbue functions with type signatures!

@signature(int, int)
def foo(x, y):
    print(type(x))
    print(type(y))
    print(x + y)

>>> foo('3', '4')
<class 'int'>
<class 'int'>
7

Basically, this is a type-explicit version of @utdemir's method.

martineau
Andrew Lee
  • Now that's an interesting idea too! I would rather automatically cast if I can because it will save time, but your way would be much "safer" – sequenceGeek Aug 11 '11 at 00:00
  • @sequenceGeek -- that's right, in fact looking at the other answers this is basically an explicit version of utdemir's method. – Andrew Lee Aug 11 '11 at 00:24
  • Nice idea! Just note that current implementation might result in error/less args being sent to 'func' in the case of wanting to evaluate less arguments. e.g. "@signature(int) func(x,y,z)". – Josejulio Sep 11 '17 at 04:16
  • Nice idea! Would like to point out that this can also be used on an _existing_ function by "manually" applying the decorator to it: i.e. `existing_func = signature(int, str)(existing_func)`. – martineau May 22 '20 at 17:57
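Both of the last two comments can be demonstrated. The sketch below repeats the answer's decorator (with a dict comprehension), then applies it "manually" to an existing function, and shows the `zip()` truncation caveat: a signature shorter than the argument list silently drops the extra arguments before the call. The `concat` helper is a hypothetical example function:

```python
def signature(*args, **kwargs):
    def decorator(fn):
        def wrapped(*fn_args, **fn_kwargs):
            # zip() stops at the shorter sequence, so a signature with fewer
            # entries than the call has arguments silently drops the extras.
            new_args = [t(raw) for t, raw in zip(args, fn_args)]
            new_kwargs = {k: kwargs[k](v) for k, v in fn_kwargs.items()}
            return fn(*new_args, **new_kwargs)
        return wrapped
    return decorator

def concat(a, b):
    return a + b

# "Manually" applying the decorator to an existing function:
typed_concat = signature(int, int)(concat)
print(typed_concat('3', '4'))  # 7

# Caveat: signature(int) on a two-argument function converts only the
# first argument and drops the second before calling concat.
short = signature(int)(concat)
try:
    short('3', '4')
except TypeError as e:
    print('dropped argument:', e)
```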

If you're parsing arguments from the command line, you should use the argparse module (available since Python 2.7).

Each argument can have an expected type so knowing what to do with it should be relatively straightforward. You can even define your own types.

...quite often the command-line string should instead be interpreted as another type, like a float or int. The type keyword argument of add_argument() allows any necessary type-checking and type conversions to be performed. Common built-in types and functions can be used directly as the value of the type argument:

parser = argparse.ArgumentParser()
parser.add_argument('foo', type=int)
parser.add_argument('bar', type=file)  # Python 2; on Python 3 use type=argparse.FileType('r')
parser.parse_args('2 temp.txt'.split())
# Namespace(bar=<open file 'temp.txt', mode 'r' at 0x...>, foo=2)
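"Define your own types" just means passing any callable that takes the raw argument string and returns a value, raising to reject it. A minimal sketch, with a hypothetical comma_list converter (not from the argparse docs):

```python
import argparse

def comma_list(s):
    # A custom "type" is any callable: string in, converted value out;
    # raise argparse.ArgumentTypeError (or ValueError) to reject input.
    items = s.split(',')
    if not all(items):
        raise argparse.ArgumentTypeError('empty item in list: %r' % s)
    return items

parser = argparse.ArgumentParser()
parser.add_argument('--ids', type=comma_list)
args = parser.parse_args(['--ids', 'a,b,c'])
print(args.ids)  # ['a', 'b', 'c']
```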
John Lyon
  • Can the downvoter explain themselves? How is this not better than hacking with `eval`? The question clearly says *"I mostly run python scripts from the command line in pipelines and so my arguments are always strings that need to be type casted."*, and this answer will do that exactly. – John Lyon Jan 24 '13 at 03:20
  • Not the downvoter, but probably because writing an argument parser is about as much effort as manually casting arguments. Argparse is good for stuff you write once and use often, but for one-off uses it's overkill. – hoodakaushal Jun 22 '19 at 11:58

I know I arrived late at this game, but how about eval?

def my_cast(a):
    try:
        return eval(a)
    except Exception:
        return a

or alternatively (and more safely):

from ast import literal_eval

def mycast(a):
    try:
        return literal_eval(a)
    except (ValueError, SyntaxError):
        return a
cmantas
  • Using `eval` might not be a good idea -- if you try to do `my_cast` on user-supplied input, they could potentially get remote code execution by inputting the right payload. – Aaron Esau Feb 23 '18 at 02:18

There are a couple of problems in your snippet.

# first test bools
if var == 'True':
    return True
elif var == 'False':
    return False

This tests against the string values 'True' and 'False', so it assumes var is already a string; any other type will simply fall through these checks.

There is no automatic coercion of types in Python. The arguments you receive via *args and **kwargs can be anything: the first collects a sequence of positional values (each of which can be any type, primitive or complex) and the second collects a mapping of keyword arguments. So if you write a decorator, you will end up with a long list of error checks.

Normally, if you want to pass a str, just convert it explicitly with str() at the call site and send it.

Senthil Kumaran