I'm trying to write a bigram counter. I have this code, but it keeps giving me:

counts[given][char] += 1 
 IndexError: list index out of range

I don't know how to handle it. Can anyone help me?

def pairwise(s):
    a,b = itertools.tee(s)
    next(b)
    return zip(a,b)
    counts = [[0 for _ in range(52)] for _ in range(52)]

with open('path/to/open') as file:
    for a,b in pairwise(char for line in file for word in line.split() for char in word):  
        given = ord(a) - ord('a')                                                            
        char = ord(b) - ord('a')                                                             

        counts[given][char] += 1

I get this error:

Traceback: counts[given][char] += 1 IndexError: list index out of range
Artjom B.
py.codan

1 Answer

Your counts variable is a local in the pairwise() function.

As such, trying to access counts as a global in the for loop will raise a NameError. But you instead silenced that exception with a blanket except. Don't do that. See Why is "except: pass" a bad programming practice? for example. If you wanted to ignore index errors, then catch just that exception, explicitly:

except IndexError:
    print('failed')

and let other exceptions reach you so you can correct errors.
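To illustrate the difference, here is a minimal sketch. The names `table`, `bump` and `bump_with_typo` are hypothetical and not part of the question's code. A handler for `IndexError` alone deals with the out-of-range case, while a misspelled name still surfaces as a `NameError`:

```python
table = [[0, 0], [0, 0]]  # hypothetical 2x2 table standing in for counts


def bump(row, col):
    """Increment one cell; return False when the index is out of range."""
    try:
        table[row][col] += 1
    except IndexError:
        return False
    return True


def bump_with_typo(row, col):
    """Same handler, but the table name is deliberately misspelled."""
    try:
        tabel[row][col] += 1  # NameError, which except IndexError does not catch
    except IndexError:
        return False
    return True


ok = bump(1, 1)       # the cell exists, so it is incremented
skipped = bump(5, 0)  # the IndexError is handled and the cell is skipped

try:
    bump_with_typo(0, 0)
    typo_hidden = True
except NameError:
    typo_hidden = False  # the bug reaches the caller instead of being swallowed
```

With a bare `except:` in `bump_with_typo`, the call would quietly return `False` and the typo would go unnoticed.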

Unindent the counts line; it is not meant to be part of the pairwise() function:

import itertools

def pairwise(s):
    # Yield consecutive pairs: (s0, s1), (s1, s2), ...
    a, b = itertools.tee(s)
    next(b, None)  # the default avoids StopIteration on empty input
    return zip(a, b)

counts = [[0 for _ in range(52)] for _ in range(52)]

with open('path/to/open') as file:
    for a, b in pairwise(char for line in file for word in line.split() for char in word):
        given = ord(a) - ord('a')
        char = ord(b) - ord('a')
        try:
            counts[given][char] += 1
        except IndexError:
            # unknown character, ignore this one
            pass

Note that anything outside lowercase ASCII letters (a-z) produces indices that are either too large or negative. ord('a') is 97, but uppercase letters range from 65 through 90, so they give integers from -32 through -7. Python accepts negative list indices and counts them from the end of the list, so those characters do not raise IndexError; they are silently counted in the wrong cells. That may not be what you wanted.
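One way around the index arithmetic, assuming the 52x52 table is meant to cover a-z plus A-Z (the question does not say so explicitly), is to look each character up in a mapping instead of subtracting ord('a'). A minimal sketch; `LETTERS`, `INDEX` and `bigram_counts` are names made up for this example:

```python
import itertools
import string

LETTERS = string.ascii_letters  # 'a'..'z' followed by 'A'..'Z', 52 characters
INDEX = {ch: i for i, ch in enumerate(LETTERS)}


def pairwise(iterable):
    """Yield consecutive pairs: (s0, s1), (s1, s2), ..."""
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)


def bigram_counts(text):
    """Count letter bigrams; pairs containing a non-letter are skipped."""
    counts = [[0] * len(LETTERS) for _ in LETTERS]
    # As in the question, whitespace is dropped, so pairs span word boundaries.
    chars = (ch for word in text.split() for ch in word)
    for first, second in pairwise(chars):
        if first in INDEX and second in INDEX:
            counts[INDEX[first]][INDEX[second]] += 1
    return counts


counts = bigram_counts("Ab ab")
# 'a' maps to 0, 'b' to 1 and 'A' to 26 under this mapping
```

Because every index comes from the mapping, neither a negative index nor an IndexError can occur, and no try/except is needed around the increment.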

Martijn Pieters
  • I get this Traceback: `counts[given][char] += 1 IndexError: list index out of range` – py.codan Jan 15 '15 at 14:03
  • What should I do? I'm completely blank. – py.codan Jan 15 '15 at 14:22
  • Do about what? The exception? Did you read all of my answer? The code sample I give includes an `except IndexError` handler. – Martijn Pieters Jan 15 '15 at 14:30
  • @py.codan: the exception you show me is caught by that exception handler. If it isn't, you are not using the same code. – Martijn Pieters Jan 15 '15 at 14:41
  • If you go up and look at the question again, I've edited it a bit so the `except` isn't there, and I get an `IndexError`, as you said. – py.codan Jan 15 '15 at 14:43
  • @py.codan: and if you read my full answer carefully, you'll see that I tell you that if you need to catch specific exceptions, then you need to do so. I warn against a **blanket** exception handler, not against specific exceptions. My answer then goes on to show you code that uses an explicit exception handler for `IndexError`. – Martijn Pieters Jan 15 '15 at 14:46
  • I'm simply lost in your answer. I'm thinking that I don't understand what's going on. – py.codan Jan 15 '15 at 14:53
  • The last codeblock is a full replacement for what you posted. Can you see what differences there are between the two? – Martijn Pieters Jan 15 '15 at 14:58
  • What I posted lets IDLE show us the IndexError, while the last code block catches it when it happens. – py.codan Jan 15 '15 at 15:02