I'm trying to match a list of keywords against strings inside a for loop, but I'm getting this error:

TypeError: unhashable type: 'list'

An excerpt of my code is as follows:

key = ['lorem', 'ipsum', 'dolor']

for item in stringloop:
    matcher = re.compile(key, re.IGNORECASE)
    if filter(matcher.match, item):
       # Some code
Blacksun

2 Answers

Starting with this:

import re

stringloop = ['lorem 123', 'testfoo', 'dolor 456']
key = ['lorem', 'ipsum', 'dolor']

First, you need a pattern that matches any one of the keys. Use the alternation operator |: x|y|z matches x or y or z. Compile the pattern once, outside the loop:

matcher = re.compile('|'.join(map(re.escape, key)), re.I) # escaping possible metacharacters

Here, re.escape escapes any regex metacharacters in the keys so that they are matched literally. Do not use it if your keys are themselves regex patterns, because their metacharacters would lose their special meaning. Now loop through stringloop, calling matcher.match on each item. Don't use filter; call the method directly:

for item in stringloop:
    if matcher.match(item):
        print(item)

This gives:

lorem 123
dolor 456
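
To pull the pieces above together, here is a minimal self-contained sketch. The sample data is hypothetical (the 'c++' key is added only to show why escaping matters), and it also contrasts match, which only tests the start of a string, with search, which looks anywhere in it:

```python
import re

# Hypothetical sample data; 'c++' shows why re.escape is needed.
keys = ['lorem', 'ipsum', 'c++']
items = ['Lorem 123', 'testfoo', 'see ipsum here', 'C++ notes']

# re.compile expects a single pattern string, not a list (passing the list
# is what produced the TypeError in the question), so join the keys with |.
pattern = '|'.join(re.escape(k) for k in keys)
matcher = re.compile(pattern, re.IGNORECASE)

# match() only succeeds if a key occurs at the start of the string.
at_start = [s for s in items if matcher.match(s)]

# search() succeeds if a key occurs anywhere in the string.
anywhere = [s for s in items if matcher.search(s)]

print(at_start)
print(anywhere)
```

Which of the two methods is right depends on whether the keywords must begin the string or may appear anywhere in it.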

For complicated patterns with their own meta characters, you should probably compile each pattern separately in a pattern list:

matchers = [re.compile(pat, re.I) for pat in key]

You'll then modify your loop slightly:

for item in stringloop:
    for m in matchers:    
        if m.match(item):
            print(item)
            break

This also works, giving:

lorem 123
dolor 456

But it is slower, because each item may be tested against every pattern in turn.
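
The nested loop can also be written with any(), which stops at the first pattern that matches. This is only a sketch: the patterns and sample strings below are hypothetical, and matches_any is an illustrative helper name, not something from the question:

```python
import re

# Hypothetical keys that are themselves regex patterns, so they are
# compiled as written instead of being passed through re.escape.
patterns = [r'lorem\s+\d+', r'ip+sum', r'dol[ou]r']
items = ['lorem 123', 'lorem ipsum', 'testfoo', 'Dolur 456']

# Compile each pattern once, outside the loop over items.
matchers = [re.compile(p, re.IGNORECASE) for p in patterns]

def matches_any(text):
    """Return True if any compiled pattern matches at the start of text."""
    return any(m.match(text) for m in matchers)

matched = [s for s in items if matches_any(s)]
print(matched)
```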


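As a closing comment, if your keys are plain strings, I would just use str.startswith, which accepts a tuple of prefixes and checks whether the string begins with any of them. Because the item is lower-cased first, the keys must be lower-case too for the comparison to be case-insensitive: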

for item in stringloop:
    if item.lower().startswith(tuple(key)):
        print(item)

This also gives:

lorem 123
dolor 456
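
If the keys are not guaranteed to be lower-case, one option is to normalise them once up front. A minimal sketch with hypothetical mixed-case data:

```python
# Hypothetical mixed-case keys and items.
keys = ['Lorem', 'ipsum', 'DOLOR']
items = ['lorem 123', 'testfoo', 'Dolor 456']

# str.startswith accepts a tuple of prefixes; lower-case both sides so the
# comparison ignores case.
prefixes = tuple(k.lower() for k in keys)
matched = [s for s in items if s.lower().startswith(prefixes)]
print(matched)
```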
cs95
  • @NullUserException Good catch. Fixed. – cs95 Jul 23 '17 at 20:13
  • I personally would not use regex for this, but this works – NullUserException Jul 23 '17 at 20:14
  • As a warning, I'm not 100% sure this would affect you, but there have been reports of other regex engines failing strangely with long sets of "or" operators. I'm not sure how many keys the OP may eventually have in his code, but you can see the SO article [here](https://stackoverflow.com/questions/44778723/performance-issue-with-grep-f) and another one [here](https://github.com/BurntSushi/ripgrep/issues/497#issuecomment-311719249) – CDahn Jul 23 '17 at 20:23
  • @Blacksun Simple. `matcher.match` looks only at the beginning. You need `matcher.search` for your code. – cs95 Jul 23 '17 at 21:01
  • @CDahn - I just posted an answer that will cure any alternation latency issues. See that link you reference https://stackoverflow.com/questions/44778723/performance-issue-with-grep-f –  Jul 23 '17 at 23:56

I think what you're trying to do is the following:

key = ['lorem', 'ipsum', 'dolor']
finallist = []

for item in stringloop:
    for regex in key:
        if re.match(regex, item):
            finallist.append(item)
            # Some code

This uses each element of key as the regex to match against each string in stringloop. As COLDSPEED noted, compiling inside the loop for a single use defeats the purpose of compiling at all, so pass each key directly to re.match instead. Then, instead of using filter, build the final list in the loop itself.
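
The same idea can be sketched as a list comprehension. The data here is hypothetical, with overlapping keys chosen on purpose: in a plain nested loop, an item that matches two keys would be appended twice, whereas any() stops at the first match so each item is added at most once. re.IGNORECASE is carried over from the question:

```python
import re

# Hypothetical data; 'lor' and 'lorem' overlap deliberately.
stringloop = ['lorem 123', 'testfoo', 'Dolor 456']
key = ['lor', 'lorem', 'dolor']

# Keep an item if at least one key matches at its start.
finallist = [
    item for item in stringloop
    if any(re.match(regex, item, re.IGNORECASE) for regex in key)
]
print(finallist)
```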

CDahn