Is there a short way to remove all strings that contain numbers from a list?
For example
my_list = [ 'hello' , 'hi', '4tim', '342' ]
would return
my_list = [ 'hello' , 'hi']
I find using isalpha() the most elegant, but it will also remove items that contain other non-alphabetic characters:
Return true if all characters in the string are alphabetic and there is at least one character, false otherwise. Alphabetic characters are those characters defined in the Unicode character database as “Letter”
my_list = [item for item in my_list if item.isalpha()]
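To make that caveat concrete, here is a small sketch. The list `samples` is made up for illustration and is not from the question; it shows that isalpha() also drops strings containing hyphens or spaces, as well as the empty string:

```python
# Hypothetical sample data (not from the question).
samples = ['hello', 'hi', '4tim', '342', 'good-bye', 'two words', '']

# isalpha() is True only for non-empty strings made up entirely of letters.
kept = [item for item in samples if item.isalpha()]
print(kept)  # ['hello', 'hi']
```

If strings like 'good-bye' should survive, a digit-specific test is probably the better fit.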
I'd use a regex:
import re
my_list = [s for s in my_list if not re.search(r'\d',s)]
In terms of timing, the regex is significantly faster on your sample data than the isdigit solution. Admittedly, it's slower than isalpha, but the two behave differently on strings with punctuation, whitespace, etc. Since the problem doesn't specify what should happen with those strings, it's not clear which is the best solution.
import re
import timeit

my_list = ['hello', 'hi', '4tim', '342', 'adn322']

def isalpha(mylist):
    return [item for item in mylist if item.isalpha()]

def fisalpha(mylist):
    # list() forces evaluation on Python 3, where filter() is lazy
    return list(filter(str.isalpha, mylist))

def regex(mylist, myregex=re.compile(r'\d')):
    return [s for s in mylist if not myregex.search(s)]

def isdigit(mylist):
    return [x for x in mylist if not any(c.isdigit() for c in x)]

for func in ('isalpha', 'fisalpha', 'regex', 'isdigit'):
    print(func, timeit.timeit(func + '(my_list)', 'from __main__ import my_list,' + func))
Here are my results:
isalpha 1.80665302277
fisalpha 2.09064006805
regex 2.98224401474
isdigit 8.0824341774
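The behavioral difference mentioned above can be seen in a short sketch. The inputs in `samples` are hypothetical and were chosen to include punctuation and whitespace:

```python
import re

# Hypothetical inputs chosen to include punctuation and whitespace.
samples = ['hello', 'good-bye', 'two words', '4tim', 'adn322']

# Keeps only strings made up entirely of letters.
by_isalpha = [s for s in samples if s.isalpha()]

# Keeps every string that has no digit anywhere in it.
by_regex = [s for s in samples if not re.search(r'\d', s)]

print(by_isalpha)  # ['hello']
print(by_regex)    # ['hello', 'good-bye', 'two words']
```

Which result is "right" depends on what the asker wants to happen with strings such as 'good-bye'.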
Try:
import re
my_list = [x for x in my_list if re.match("^[A-Za-z_-]*$", x)]
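As a rough illustration of what this pattern lets through (the extra entries in `samples` are made up for the example): it accepts underscores and hyphens in addition to ASCII letters, and because of the `*` quantifier it also accepts the empty string.

```python
import re

pattern = re.compile(r"^[A-Za-z_-]*$")

# Hypothetical sample data extending the list from the question.
samples = ['hello', 'hi', '4tim', '342', 'snake_case', 'good-bye', '', 'two words']

kept = [x for x in samples if pattern.match(x)]
print(kept)  # ['hello', 'hi', 'snake_case', 'good-bye', '']
```

Changing `*` to `+` would reject empty strings, if that is the behavior you want.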
Sure, use the digits constant from the string module and test whether any of them are present. We'll get a little fancy and just test for truthiness in the list comprehension: if it returns anything, there are digits in the string.
So:
import string

out_list = []
for item in my_list:
    # the inner list is non-empty (truthy) only if item contains a digit
    if not [char for char in item if char in string.digits]:
        out_list.append(item)
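The same test can be written more compactly with any(), which stops at the first digit instead of building a throwaway list. This is only a sketch of an equivalent formulation, assuming that only the ASCII digits in string.digits matter:

```python
import string

my_list = ['hello', 'hi', '4tim', '342']

# string.digits is '0123456789'; a set makes each membership test cheap.
digits = set(string.digits)

# any() short-circuits as soon as one digit is found in the item.
out_list = [item for item in my_list if not any(ch in digits for ch in item)]
print(out_list)  # ['hello', 'hi']
```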
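And yet another slight variation. Note that match() only anchors at the start of the string, so the pattern needs `+$` to check every character, not just the first:
>>> import re
>>> list(filter(re.compile('(?i)[a-z]+$').match, my_list))
['hello', 'hi']
Add any other characters you want to allow (spaces, punctuation, and so on) to the character class in the regex.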
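For instance, a whitelist that also permits spaces, apostrophes and hyphens might look like the sketch below. The particular set of allowed characters and the `samples` list are assumptions for illustration, and fullmatch() (Python 3.4+) is used so that the whole string has to match:

```python
import re

# Hypothetical whitelist: ASCII letters plus space, apostrophe and hyphen.
allowed = re.compile(r"[A-Za-z '\-]+")

# Hypothetical sample data extending the list from the question.
samples = ['hello', 'hi', '4tim', '342', "don't stop", 'a4']

# fullmatch() succeeds only if every character is in the whitelist.
kept = [s for s in samples if allowed.fullmatch(s)]
print(kept)  # ['hello', 'hi', "don't stop"]
```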