Python: if any(x in y), case- and accent-insensitive?

Question

I have the following list of strings:

exclude = ['eee', 'iii']

I have a word to test:

word = 'Iîïe'

I want the following test to be true:

if any(x in word for x in exclude):
    pass  # I want to be here!

For the test to be true, the comparison needs to be both case-insensitive and accent-insensitive. How can I do that?

`in` just does substring searching, so it is both case-sensitive and accent-sensitive. – OneCricketeer, Oct 02 '16 at 13:16
I have no idea how to do accent insensitivity; you would have to define that further. For case insensitivity, `any([x in word.lower() for x in exclude])` would do the job. Are you sure you don't mean `exclude = ['e', 'i']`? – timakro, Oct 02 '16 at 13:16
See this [answer](http://stackoverflow.com/a/29247821/4099593). – Bhargav Rao, Oct 02 '16 at 13:47

score 1 · Accepted Answer · answered Oct 02 '16 at 13:16

You can use a third-party package called unidecode. From its project description:

What Unidecode provides is a middle road: function unidecode() takes Unicode data and tries to represent it in ASCII characters (i.e., the universally displayable characters between 0x00 and 0x7F), where the compromises taken when mapping between two character sets are chosen to be near what a human with a US keyboard would choose.

Example:

from unidecode import unidecode
...
if any(x in unidecode(word).lower() for x in exclude):
    ...
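
If adding a dependency is not an option, something similar can be approximated with the standard library alone. The following is only a rough sketch and is not how unidecode works internally: it decomposes each character with `unicodedata.normalize` and drops the combining marks. The helper names `strip_accents` and `normalize` are made up for illustration.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining marks removed after NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(text):
    # casefold() is a more aggressive lower() meant for caseless matching.
    return strip_accents(text).casefold()


exclude = ['eee', 'iii']
word = 'Iîïe'

normalized = normalize(word)
matched = any(x in normalized for x in exclude)
print(normalized, matched)  # iiie True
```

Note that this only handles characters that decompose into a base letter plus combining marks. Letters such as 'ø' or 'ł' have no such decomposition and pass through unchanged, which is one reason a transliteration package like unidecode may still be the better fit.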

answered Oct 02 '16 at 13:16

Selcuk


Do you know how resource-intensive it is? – Vincent Oct 02 '16 at 13:20
I have no idea, but you should be able to benchmark it using sample data from your use case. I also suggest reading the "Performance notes" section on the project page. – Selcuk Oct 02 '16 at 13:22
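
One way to run such a benchmark is with the standard library's `timeit` module. The sketch below is only an illustration: `time_normalizer` and the sample words are hypothetical, and `str.lower` stands in for the function under test. Swap in `unidecode` (or whatever you end up using) and your own data.

```python
import timeit


def time_normalizer(func, words, number=1000):
    """Return the average time in seconds for one call of func on a sample word."""
    def run():
        for w in words:
            func(w)

    total = timeit.timeit(run, number=number)
    return total / (number * len(words))


# Hypothetical sample data; replace with words from your real use case.
sample_words = ['Iîïe', 'Éléphant', 'naïve', 'plain']

# str.lower is a stand-in; pass the function you actually want to measure.
per_call = time_normalizer(str.lower, sample_words)
print(f"{per_call * 1e6:.3f} microseconds per call")
```

The absolute numbers depend on the machine and the input, so it is the comparison between candidate functions on the same data that is informative.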
