Remove strings from a list that contains numbers in python

Question

Is there a short way to remove all strings in a list that contain numbers?

For example

my_list = [ 'hello' , 'hi', '4tim', '342' ]

would return

my_list = [ 'hello' , 'hi']

Well, this changes the question entirely – jamylak Apr 18 '13 at 14:03

score 39 · Accepted Answer · answered Apr 18 '13 at 13:45

Without regex:

[x for x in my_list if not any(c.isdigit() for c in x)]
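For readers wondering what the c.isdigit() part does, here is a minimal sketch that unpacks the one-liner into an explicit helper. The name has_digit is made up for illustration and is not part of the original answer.

```python
def has_digit(s):
    # str.isdigit() is True for a character such as '4' and False for 't'.
    # any() returns True as soon as one character passes the test,
    # so the scan stops at the first digit it meets.
    return any(c.isdigit() for c in s)

my_list = ['hello', 'hi', '4tim', '342']

# Keep only the strings in which no character is a digit.
cleaned = [x for x in my_list if not has_digit(x)]
print(cleaned)  # ['hello', 'hi']
```

Because the generator expression is lazy, a string like '4tim' is rejected after looking at its first character only.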

answered Apr 18 '13 at 13:45 by eumiro

Where do you find miscellaneous useful functions like any()? – thavan Apr 18 '13 at 14:17
@thavan: http://docs.python.org/2/library/functions.html – eumiro Apr 18 '13 at 14:58
How would you do this if you had a data frame? – Laura Jun 09 '20 at 19:13
Could someone explain the c.isdigit() part? – amarykya_ishtmella Jan 03 '21 at 22:27

score 7 · Answer 2 · answered Apr 18 '13 at 13:46

I find using isalpha() the most elegant, but it will also remove items that contain other non-alphabetic characters:

Return true if all characters in the string are alphabetic and there is at least one character, false otherwise. Alphabetic characters are those characters defined in the Unicode character database as “Letter”

my_list = [item for item in my_list if item.isalpha()]
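To make the trade-off concrete, the sketch below runs both filters over a list that also contains a space, an apostrophe, and non-ASCII letters. The sample strings are invented for illustration, and the results assume Python 3, where every str is Unicode.

```python
samples = ['hello', 'hi there', "don't", 'åäö', '4tim', '342']

# isalpha() keeps a string only if every character is a letter.
only_letters = [s for s in samples if s.isalpha()]

# The digit test drops a string only if it contains a digit.
no_digits = [s for s in samples if not any(c.isdigit() for c in s)]

print(only_letters)  # ['hello', 'åäö']
print(no_digits)     # ['hello', 'hi there', "don't", 'åäö']
```

So isalpha() is the stricter filter: it is a good fit when only pure words should survive, and too aggressive when spaces or punctuation are acceptable.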

answered Apr 18 '13 at 13:46 by Adam

He wants to remove strings with numbers, but special characters (spaces, punctuation,…) are probably allowed. – eumiro Apr 18 '13 at 13:46
Except it won't work for punctuation – jamylak Apr 18 '13 at 13:48
That's correct. I still thought I'd include it because it *will* work for many scenarios. – Adam Apr 18 '13 at 13:49
Written "old style", which (IMHO) reads well in this case, at least when run on 2.x: `filter(str.isalpha, my_list)` – Jon Clements Apr 18 '13 at 13:50
That's not old style! I would do it that way – jamylak Apr 18 '13 at 13:51
The problem with this is that it also removes characters åäö, but that is exactly what I want to do, remove all non characters – user1506145 Apr 18 '13 at 13:52
@user1506145 - then define it in your question please. – eumiro Apr 18 '13 at 13:52
@user1506145 it will work just fine if you encode them in unicode, i.e. by using the `u` prefix as in `u'åääö'`. In Python 3, all strings are unicode and this is not an issue. – Adam Apr 18 '13 at 13:57

Answer 3 · answered Apr 18 '13 at 14:07 · mgilson

I'd use a regex:

import re
my_list = [s for s in my_list if not re.search(r'\d',s)]

In terms of timing, using a regex is significantly faster on your sample data than the isdigit solution. Admittedly, it's slower than isalpha, but the behavior is slightly different with punctuation, whitespace, etc. Since the problem doesn't specify what should happen with those strings, it's not clear which is the best solution.

import re
import timeit

my_list = ['hello', 'hi', '4tim', '342', 'adn322']

def isalpha(mylist):
    return [item for item in mylist if item.isalpha()]

def fisalpha(mylist):
    # list() forces evaluation on Python 3, where filter() is lazy
    return list(filter(str.isalpha, mylist))

def regex(mylist, myregex=re.compile(r'\d')):
    # the pattern is compiled once, when the function is defined
    return [s for s in mylist if not myregex.search(s)]

def isdigit(mylist):
    return [x for x in mylist if not any(c.isdigit() for c in x)]

for func in ('isalpha', 'fisalpha', 'regex', 'isdigit'):
    setup = 'from __main__ import my_list, ' + func
    print('%s %s' % (func, timeit.timeit(func + '(my_list)', setup)))

Here are my results (seconds for timeit's default of 1,000,000 calls):

isalpha 1.80665302277
fisalpha 2.09064006805
regex 2.98224401474
isdigit 8.0824341774

Wow, that's surprising; it must be better for larger inputs though – jamylak Apr 18 '13 at 14:08
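Whether the ranking changes for larger inputs is easy to measure rather than guess. The sketch below reuses three of the functions on a made-up workload of long strings; big_list, DIGIT, and the repeat count are assumptions chosen for illustration. No timings are quoted because they depend on the machine and the Python version.

```python
import re
import timeit

DIGIT = re.compile(r'\d')

def regex(strings):
    return [s for s in strings if not DIGIT.search(s)]

def isdigit(strings):
    return [s for s in strings if not any(c.isdigit() for c in s)]

def isalpha(strings):
    return [s for s in strings if s.isalpha()]

# Hypothetical workload: 1000-character strings, half of them ending in a
# digit, so each filter has to scan every string to the end.
big_list = ['a' * 1000, 'a' * 999 + '1'] * 50

for func in (isalpha, regex, isdigit):
    seconds = timeit.timeit(lambda: func(big_list), number=20)
    print(func.__name__, seconds)
```

Placing the digit at the end of the string is deliberate: it removes the early-exit advantage that any() and re.search() get on inputs like '4tim'.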

score 1 · Answer 4 · answered Apr 18 '13 at 13:45

Try:

import re
my_list = [x for x in my_list if re.match("^[A-Za-z_-]*$", x)]
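Note that this pattern is a whitelist rather than a digit test: it accepts only strings made up entirely of ASCII letters, underscores, and hyphens. A short sketch of what that implies, using sample strings invented for illustration:

```python
import re

allowed = re.compile(r"^[A-Za-z_-]*$")

samples = ['hello', 'snake_case', 'well-known', 'hi there', 'åäö', '4tim', '']

kept = [s for s in samples if allowed.match(s)]
print(kept)  # ['hello', 'snake_case', 'well-known', '']
```

Strings with spaces or non-ASCII letters are dropped along with the numeric ones, and the empty string is kept because `*` allows zero characters; using `+` instead would reject it.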

answered Apr 18 '13 at 13:45 by Pablo Santa Cruz

Where did you get this predefined character set? – jamylak Apr 18 '13 at 13:48

score 0 · Answer 5 · answered Apr 18 '13 at 13:47

Sure: use the digits constant from the string module and test whether any of those characters appear. We'll get a little fancy and just test the truthiness of a list comprehension; if it returns anything, there are digits in the string.

So:

import string

out_list = []
for item in my_list:
    # a non-empty list means at least one character is a digit
    if not [char for char in item if char in string.digits]:
        out_list.append(item)
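The same idea can be written more compactly with a set. This is a sketch of an alternative, not the answer's own code; DIGITS is a name introduced here for illustration.

```python
import string

DIGITS = set(string.digits)  # the ten characters '0' through '9'

my_list = ['hello', 'hi', '4tim', '342']

# isdisjoint() is True when the string shares no character with DIGITS.
out_list = [item for item in my_list if DIGITS.isdisjoint(item)]
print(out_list)  # ['hello', 'hi']
```

Unlike str.isdigit(), string.digits covers only the ASCII digits, so digit characters from other scripts would not be caught by this version.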

score 0 · Answer 6 · answered Apr 18 '13 at 13:56

And yet another slight variation:

>>> import re
>>> list(filter(re.compile('(?i)[a-z]+$').match, my_list))
['hello', 'hi']

Add whichever other characters should count as valid (spaces, punctuation, and so on) to the character class in the regex.
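As a sketch of that extension, the pattern below also allows a space, apostrophe, comma, period, and hyphen. The chosen character set and the sample strings are assumptions for illustration; fullmatch() requires Python 3.4 or later.

```python
import re

# Letters plus space, apostrophe, comma, period, and hyphen are allowed.
# fullmatch() requires the whole string to consist of allowed characters.
valid = re.compile(r"[a-z ',.-]+", re.IGNORECASE)

samples = ['hello', 'hi there', "don't", '4tim', 'route 66']
kept = [s for s in samples if valid.fullmatch(s)]
print(kept)  # ['hello', 'hi there', "don't"]
```

Keeping the hyphen last in the character class makes it a literal hyphen rather than the start of a range.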

answered Apr 18 '13 at 13:56 by Jon Clements
