210

I love using the expression

if 'MICHAEL89' in USERNAMES:
    ...

where USERNAMES is a list.


Is there any way to match items with case insensitivity or do I need to use a custom method? Just wondering if there is a need to write extra code for this.

Georgy
RadiantHex

12 Answers

250
username = 'MICHAEL89'
if username.upper() in (name.upper() for name in USERNAMES):
    ...

Alternatively:

if username.upper() in map(str.upper, USERNAMES):
    ...

Or, yes, you can make a custom method.

nmichaels
  • 9
    `if 'CaseFudge'.lower() in [x.lower() for x in list]` – fredley Sep 02 '10 at 14:00
  • 54
    `[...]` creates the whole list. `(name.upper() for name in USERNAMES)` would create only a generator and one needed string at a time - massive memory savings if you're doing this operation a lot. (even more savings, if you simply create a list of lowercase usernames that you reuse for checking every time) – viraptor Sep 02 '10 at 14:06
  • 2
    Prefer to lower all keys when building the dict, for performance reasons. – Ryan May 01 '13 at 06:27
  • 1
    if [x.lower() for x in list] is a list comprehension, is (name.upper() for name in USERNAMES) a tuple comprehension? Or does it have another name? – otocan Apr 19 '18 at 08:48
  • 1
    @otocan It's a generator expression. – nmichaels Apr 19 '18 at 13:13
  • @nmichaels thanks, just wanted to know what to google – otocan Apr 20 '18 at 11:12
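viraptor's suggestion above (normalize the names once and reuse the result for every check) could be sketched roughly like this. The names FOLDED_USERNAMES and has_username are made up for illustration, the sample list is invented, and casefold is used instead of upper as an assumption about the normalization you want.

```python
USERNAMES = ['Joy23', 'Michael89', 'Rony224']  # sample data, not from the question

# Normalize once; a frozenset gives constant-time membership tests.
FOLDED_USERNAMES = frozenset(name.casefold() for name in USERNAMES)


def has_username(name):
    """Return True if name is in USERNAMES, ignoring case."""
    return name.casefold() in FOLDED_USERNAMES


print(has_username('MICHAEL89'))
```

The trade-off is that the normalized set has to be rebuilt (or updated) whenever USERNAMES changes.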
49

str.casefold is recommended for case-insensitive string matching. @nmichaels's solution can trivially be adapted.

Use either:

if 'MICHAEL89'.casefold() in (name.casefold() for name in USERNAMES):

Or:

if 'MICHAEL89'.casefold() in map(str.casefold, USERNAMES):

As per the docs:

Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string. For example, the German lowercase letter 'ß' is equivalent to "ss". Since it is already lowercase, lower() would do nothing to 'ß'; casefold() converts it to "ss".
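A small illustration of the difference the docs describe, using the 'ß' case from the quote (the variable names are only for this example):

```python
a = 'STRASSE'
b = 'straße'

# lower() leaves 'ß' untouched, so the two strings still differ.
same_lower = a.lower() == b.lower()

# casefold() expands 'ß' to 'ss', so the two strings compare equal.
same_casefold = a.casefold() == b.casefold()

print(same_lower, same_casefold)  # False True
```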

jpp
21

I would make a wrapper so you can be non-invasive. Minimally, for example...:

class CaseInsensitively(object):
    def __init__(self, s):
        self.__s = s.lower()

    def __hash__(self):
        return hash(self.__s)

    def __eq__(self, other):
        # ensure proper comparison between instances of this class
        try:
            other = other.__s
        except (TypeError, AttributeError):
            try:
                other = other.lower()
            except AttributeError:
                # not string-like: fall through and compare as-is
                pass
        return self.__s == other

Now, if CaseInsensitively('MICHAEL89') in whatever: should behave as required when the right-hand side is a list. For a dict or set, lookup goes through the hash first, so it only matches keys that are already lowercase or are themselves wrapped. (It may require more effort to achieve similar results for string inclusion, avoid warnings in some cases involving unicode, etc.)
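To make the list versus set behaviour discussed around this answer concrete, here is an illustrative rewrite of the wrapper, not the answer's original code. It assumes Python 3, uses casefold rather than lower, and the attribute name _key is invented for the sketch.

```python
class CaseInsensitively:
    """Wrap a string so that == and hash() ignore case (illustrative sketch)."""

    def __init__(self, s):
        self._key = s.casefold()

    def __hash__(self):
        return hash(self._key)

    def __eq__(self, other):
        if isinstance(other, CaseInsensitively):
            return self._key == other._key
        if isinstance(other, str):
            return self._key == other.casefold()
        return NotImplemented


needle = CaseInsensitively('MICHAEL89')

# A list is scanned item by item with ==, so the wrapper's __eq__ is used.
in_list = needle in ['joy23', 'Michael89']

# A set looks up by hash first; 'Michael89' hashes differently from 'michael89'.
in_mixed_set = needle in {'joy23', 'Michael89'}

# Wrapping the stored items too makes the hashes line up again.
in_wrapped_set = needle in {CaseInsensitively('Michael89')}

print(in_list, in_mixed_set, in_wrapped_set)
```

This is why the comments under this answer point out that a dict or set with mixed-case keys needs its own wrapping.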

Venkatesh Bachu
Alex Martelli
  • 3
    that doesn't work for dict try if CaseInsensitively('MICHAEL89') in {'Michael89':True}:print "found" – Xavier Combelle Sep 02 '10 at 14:56
  • 2
    Xavier: You would need `CaseInsensitively('MICHAEL89') in {CaseInsensitively('Michael89'):True}` for that to work, which probably doesn't fall under "behave as required". – Gabe Sep 02 '10 at 15:07
  • So much for there being only 1 obvious way to do it. This feels heavy unless it's going to be used a lot. That said, it's very smooth. – nmichaels Sep 02 '10 at 17:56
  • 2
    @Nathon, it seems to me that having to invasively alter the container is the "feels heavy" operation. A completely non-invasive wrapper: how much "lighter" than this could one get?! Not much;-). @Xavier, RHS's that are dicts or sets with mixed-case keys/items need their own non-invasive wrappers (part of the short `etc.` and "require more effort" parts of my answer;-). – Alex Martelli Sep 02 '10 at 18:14
  • My definition of heavy involves writing quite a bit of code to make something that will only be used once, where a less robust but much shorter version would do. If this is going to be used more than once, it's perfectly sensible. – nmichaels Sep 02 '10 at 18:35
15

Usually (in OOP at least) you shape your object to behave the way you want. name in USERNAMES is not case insensitive, so USERNAMES needs to change:

class NameList(object):
    def __init__(self, names):
        self.names = names

    def __contains__(self, name): # implements `in`
        return name.lower() in (n.lower() for n in self.names)

    def add(self, name):
        self.names.append(name)

# now this works
usernames = NameList(USERNAMES)
print('MICHAEL89' in usernames)

The great thing about this is that it opens the path for many improvements, without having to change any code outside the class. For example, you could change the self.names to a set for faster lookups, or compute the (n.lower() for n in self.names) only once and store it on the class and so on ...
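One of the improvements mentioned above, computing the normalized names once and keeping them in a set, might look like the following sketch. The attribute name _folded is invented here, and casefold is substituted for lower as an assumption.

```python
class NameList:
    """List of names whose `in` test ignores case (illustrative sketch)."""

    def __init__(self, names):
        self.names = list(names)
        # Normalized copies, built once and kept in sync by add().
        self._folded = {n.casefold() for n in self.names}

    def __contains__(self, name):  # implements `in`
        return name.casefold() in self._folded

    def add(self, name):
        self.names.append(name)
        self._folded.add(name.casefold())


usernames = NameList(['Joy23', 'Michael89'])
usernames.add('Rony224')
print('MICHAEL89' in usernames, 'rony224' in usernames, 'nobody' in usernames)
```

Code outside the class keeps using name in usernames unchanged, which is the point the answer makes about encapsulating the behaviour.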

Jochen Ritzel
10

Here's one way:

if string1.lower() in string2.lower(): 
    ...

For this to work, both string1 and string2 must be of type str. Note that this tests whether string1 is a substring of string2, not whether it is an item of a list.

User
6

I think you have to write some extra code. For example:

if 'MICHAEL89' in map(lambda name: name.upper(), USERNAMES):
   ...

In Python 2, map builds a new list with every entry in USERNAMES converted to upper case, and the membership test runs against that list. In Python 3, map returns a lazy iterator instead, so no intermediate list is created.

Update

As @viraptor says, it is even better to use a generator instead of map. See @Nathon's answer.

Community
Manoj Govindan
  • Or you could use `itertools` function `imap`. It's much faster than a generator but accomplishes the same goal. – wheaties Sep 02 '10 at 14:24
5

You could do

import re

matcher = re.compile(re.escape('MICHAEL89'), re.IGNORECASE)
filter(matcher.fullmatch, USERNAMES)  # fullmatch, so 'michael89xyz' is not accepted

Update: played around a bit and am thinking you could get a better short-circuit type approach using

matcher = re.compile(re.escape('MICHAEL89'), re.IGNORECASE)
if any(filter(matcher.fullmatch, USERNAMES)):
    ...  # your code here

In Python 3 the built-in filter is lazy (in Python 2, itertools.ifilter played that role): it only produces the next item when asked, so any() stops at the first match.

wheaties
  • Just to add, the pattern might need to be escaped, since it might contain characters like ".","?", which has specail meaning in regular expression patterns. use re.escape(raw_string) to do it – Iching Chang Jan 08 '17 at 23:29
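The escaping point can be seen in a short sketch. The helper name contains_ci and the sample usernames are invented for illustration, and fullmatch is used on the assumption that an exact, whole-string match is wanted.

```python
import re

USERNAMES = ['Michael89', 'aXb']  # sample data for the example


def contains_ci(target, names):
    """Case-insensitive exact membership test built on a regex."""
    # re.escape makes characters such as '.' and '?' match literally.
    pattern = re.compile(re.escape(target), re.IGNORECASE)
    return any(pattern.fullmatch(name) for name in names)


# Without escaping, '.' matches any character, so 'a.b' would match 'aXb'.
unescaped_hit = re.compile('a.b', re.IGNORECASE).fullmatch('aXb') is not None

print(contains_ci('MICHAEL89', USERNAMES), contains_ci('a.b', USERNAMES), unescaped_hit)
```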
1

To have it in one line, this is what I did:

if any(([True if 'MICHAEL89' in username.upper() else False for username in USERNAMES])):
    print('username exists in list')

I didn't test it time-wise though. I am not sure how fast/efficient it is.

MFA
  • if you want to improve this time-wise: make it a single generator expression in the `any`. You currently have a list comprehension in a generator comprehension in the any call. Also the ternary `True if ... else False` that would also result `True` for a username like `abc_michael89xyz`! – ewerybody Feb 09 '23 at 15:54
  • I'd suggest `if any(name == username.upper() for username in USERNAMES):` – ewerybody Feb 09 '23 at 15:54
  • you are right about `abc_michael89xyz`, but I thought this is exactly the case that it should return `True` and exact match is not important – MFA May 04 '23 at 07:50
  • `a == b` can only yield `True` or `False` :) – ewerybody May 05 '23 at 09:07
  • Ah now I know what you mean! But that's not what OP wanted! However this could just be `if any(name in username.upper() for username in USERNAMES):` Voilà! – ewerybody May 05 '23 at 09:11
  • The cool thing about this solution compared to most of the others is: That if done correctly `any` will already **end the loop** as soon as a first `True` is found! So this is not only performance but also memory friendly as there is no additional list created ad hoc – ewerybody May 05 '23 at 09:13
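The two readings discussed in these comments, exact match versus substring match, can be written as single generator expressions inside any. This is a sketch with invented sample data; casefold is an assumption in place of upper.

```python
USERNAMES = ['abc_michael89xyz', 'joy23']  # sample data for the example
target = 'MICHAEL89'.casefold()

# Exact match: True only if some username equals the target, ignoring case.
exact = any(target == name.casefold() for name in USERNAMES)

# Substring match: True if the target appears anywhere inside some username.
substring = any(target in name.casefold() for name in USERNAMES)

print(exact, substring)  # False True
```

In both forms any consumes the generator lazily, so it stops at the first match and builds no intermediate list.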
1

Example from this tutorial:

list1 = ["Apple", "Lenovo", "HP", "Samsung", "ASUS"]

s = "lenovo"
s_lower = s.lower()

res = s_lower in (string.lower() for string in list1)

print(res)
pyjavo
0

My 5 (wrong) cents

'a' in "".join(['A']).lower()

UPDATE

Ouch, totally agree @jpp, I'll keep as an example of bad practice :(

GBrian
  • 2
    This is wrong. Consider `'a' in "".join(['AB']).lower()` returns `True` when this isn't what OP wants. – jpp Apr 10 '19 at 19:36
0

I needed this for a dictionary instead of a list. Jochen's solution was the most elegant for that case, so I modified it a bit:

class CaseInsensitiveDict(dict):
    '''The special dicts in requests are case insensitive when using the
    in operator; this implements similar behaviour.'''
    def __contains__(self, name): # implements `in`
        return name.casefold() in (n.casefold() for n in self.keys())

Now you can convert a dictionary with USERNAMESDICT = CaseInsensitiveDict(USERNAMESDICT) and then use if 'MICHAEL89' in USERNAMESDICT:
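One thing worth knowing about this approach is that only the in operator is overridden; plain lookups still use the stored keys. A sketch with an invented sample dictionary (the variable names found, value and real_key are only for the example):

```python
class CaseInsensitiveDict(dict):
    def __contains__(self, name):  # implements `in`
        return name.casefold() in (n.casefold() for n in self.keys())


d = CaseInsensitiveDict({'Michael89': True})

found = 'MICHAEL89' in d       # uses the overridden __contains__
value = d.get('MICHAEL89')     # ordinary dict lookup, the key is case sensitive

# Recovering the stored key requires a scan, since the dict is keyed by the original spelling.
real_key = next((k for k in d if k.casefold() == 'MICHAEL89'.casefold()), None)

print(found, value, real_key)  # True None Michael89
```

If lookups by key are needed as well, storing casefolded keys in the first place is probably simpler.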

Megarushing
0

I am using Python 3.10.5.

Suppose I have a list

USERNAMES = ['joy23', 'michael89', 'rony224', 'samrat445']

Now if I want to check whether 'michael89' is in the list without considering case, the following code works:

'michael89'.casefold() in USERNAMES

The output will be True.

Again, if I want to check whether 'MICHAEL89' is in the list without considering case, the code is:

'MICHAEL89'.casefold() in USERNAMES

The output will also be True.

'miCHAel89'.casefold() in USERNAMES

This returns True again.

So the main catch here is that the USERNAMES list should contain only lowercase (more precisely, casefolded) entries. If you store all the items of USERNAMES that way, you can solve the problem simply by using:

if 'MICHAEL89'.casefold() in USERNAMES:
    ...
Joy Karmoker