
Usually, to detect a string field, I can just check whether the first character is a letter. For example:

>>> [str(v)[0].isalpha() for v in ['first', 'last']]
[True, True]

However, sometimes I'll have database or other fields that are strings but start with a digit. For example, "3D" is a field I've come across.

What would be the most performant way to check to see if all items in a list are strings?

Here are some examples:

['1.0', 'test', '3d', '123,000.00', '55']
> False, True, True, False, False

Basically, I want to know if a value can be stored as a varchar field or needs to be converted to a non-string field.

It would be something like:

>>> import re
>>> values = ['1.0', 'test', '3d', '123,000.00', 55]
>>> [not re.sub(r'[,.]', '', str(val)).isdigit() for val in values]
[False, True, True, False, False]

Are there better ways to do this?

David542
  • If you have a database, other than sqlite, shouldn't the type be pre-determined? – roganjosh Dec 20 '18 at 20:24
  • `isinstance(x, str)` or, if you don't care about subclasses and are more interested in performance, `type(x) is str` – juanpa.arrivillaga Dec 20 '18 at 20:25
  • @roganjosh this is to parse the data before creating the fields and see which fields I need to create. – David542 Dec 20 '18 at 20:25
  • Not just letters can be strings. '1' is also a string. – Stef van der Zon Dec 20 '18 at 20:25
  • @juanpa.arrivillaga the fields are all strings. I want to see which should be treated as actual strings. – David542 Dec 20 '18 at 20:25
  • You haven't defined what "actual strings" are. All those items *could* be stored as a varchar field, if you convert them to strings. – juanpa.arrivillaga Dec 20 '18 at 20:26
  • So flip it round and see what could be considered as something _other_ than a string maybe? – roganjosh Dec 20 '18 at 20:26
  • @roganjosh see updated question please. I've taken your suggestion, but I was wondering if there are better solutions. – David542 Dec 20 '18 at 20:28
  • Hasn't this been answered here https://stackoverflow.com/questions/354038/how-do-i-check-if-a-string-is-a-number-float ? – ar7 Dec 20 '18 at 20:29
  • For each other data type, write a function such as `convert_to_float` that raises an exception if it can't convert the input. Then try to apply each of those to each value, using strings as the default. As an aside, this seems like the kind of problem that might be solved by having an ORM, you might want to look into that. – Patrick Haugh Dec 20 '18 at 20:30
  • The regex will be slow. There's only so much you can ask of python in terms of performance. I'm tempted to suggest a try/except for converting to float. – roganjosh Dec 20 '18 at 20:30
  • @PatrickHaugh followed you up to the ORM, we had the same suggestion till then. Why would an ORM necessarily be _efficient_ here? Correctness trumps efficiency. – roganjosh Dec 20 '18 at 20:34
  • @roganjosh My thinking with suggesting an ORM is that they're interacting with database types by looking at a value and trying to guess what type it is, while it may be more feasible to explicitly map fields to types in their code. I suppose I can't really say without knowing what their actual problem is though. – Patrick Haugh Dec 20 '18 at 20:39

1 Answer

An efficient way is to call the float() constructor inside a try-except block, since the parsing is done by built-in code implemented in C. Remove occurrences of ',' first if you want to ignore thousands separators:

def not_number(string):
    try:
        float(string.replace(',', ''))
    except ValueError:
        return True
    return False

so that:

values = ['1.0', 'test', '3d', '123,000.00', '55']
[not_number(value) for value in values]

returns:

[False, True, True, False, False]
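
Two edge cases may be worth handling, depending on the data. The list in the question contains the int 55, which has no replace() method, and float() also parses special values such as 'nan' and 'inf'. The following is a minimal sketch of a variant that covers both. The name is_text_value is hypothetical, and treating 'nan'/'inf' as text is an assumption about what the target schema wants, not part of the answer above:

```python
import math

def is_text_value(value):
    """Return True if the value should be kept as a string (hypothetical helper)."""
    # str() lets non-string input such as the int 55 go through the same path.
    text = str(value).replace(',', '')
    try:
        number = float(text)
    except ValueError:
        return True
    # float() accepts 'nan', 'inf' and 'infinity'; counting them as text
    # is an assumption made for this sketch.
    return not math.isfinite(number)

values = ['1.0', 'test', '3d', '123,000.00', 55, 'nan']
flags = [is_text_value(v) for v in values]

# A column needs a varchar type as soon as one of its values is text.
column_is_text = any(flags)
```

Note that float() also accepts scientific notation such as '1e5', so those values are classified as numeric here as well.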
blhsing