fhand = open('mbox-short.rtf')

for line in fhand:
    words = line.split()
    if len(words) == 0:
        continue
    if words[0] != 'From':
        continue
    print(words[2])

This example works and handles empty lines in the document I am trying to read. However, I first came up with the solution below, which seems like a more elegant way to write it:

for line in fhand:
    words = line.split()
    if words[0] != 'From' or len(words) == 0: 
        continue
    print(words[2])

Yet this version raises a 'list index out of range' error, and I don't quite understand why. It seems Python evaluates the condition on the third line from left to right and raises an error if the first operand fails; but doesn't that defeat the purpose of an 'or' condition? I don't yet have a grasp of the mechanics behind it.

KayAl
  • Change the order of your if to `len(words) == 0 or words[0] != 'From'`. This way, if the words list is empty, the second part is not checked; only if the list is not empty will it check index 0, which we know exists since words is not empty. – Chris Doyle Feb 16 '21 at 10:15

1 Answer


You get index out of range because for an empty line words = [], so words[0] raises an IndexError.

Changing the order so that short-circuiting can take effect fixes that for you:

    if len(words) == 0 or words[0] != 'From': 
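
To make the difference in evaluation order concrete, here is a minimal sketch. The function names `safe_check` and `unsafe_check` are hypothetical and only exist for illustration; they wrap the two orderings of the condition so each can be tried on an empty list.

```python
def safe_check(words):
    # Emptiness is tested first. For an empty list the left operand is
    # True, so `or` returns immediately and words[0] is never evaluated.
    return len(words) == 0 or words[0] != 'From'


def unsafe_check(words):
    # words[0] is evaluated first. On an empty list this raises
    # IndexError before `or` ever gets to look at the second operand.
    return words[0] != 'From' or len(words) == 0


empty = ''.split()  # splitting a blank line gives []

safe_result = safe_check(empty)

try:
    unsafe_check(empty)
    unsafe_raised = False
except IndexError:
    unsafe_raised = True

print(safe_result, unsafe_raised)
```

The point is that `or` only skips its right operand; the left operand is always evaluated, so anything that can fail has to go on the right.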

But arguably it would be neater to reverse the logic:

for line in fhand:
    words = line.split()
    if len(words) > 0 and words[0] == 'From': 
        print(words[2])

This way, again, you only access words[0] if it actually exists, thanks to `and` short-circuiting.
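
As a self-contained way to try this out, the sketch below runs the same loop over an in-memory stand-in for the file. The sample lines and the `days` list are assumptions made up for illustration; they are not the contents of the real mbox file.

```python
import io

# Hypothetical sample data standing in for the real file, including
# a blank line that would have triggered the IndexError.
sample = io.StringIO(
    "From alice@example.com Sat Jan  5 09:14:16 2008\n"
    "\n"
    "Subject: hello\n"
    "From bob@example.com Fri Jan  4 18:10:48 2008\n"
)

days = []
for line in sample:
    words = line.split()
    # `and` stops at the first falsy operand, so words[0] is never
    # evaluated when the list is empty.
    if len(words) > 0 and words[0] == 'From':
        days.append(words[2])

print(days)
```

Since an empty list is falsy in Python, the condition could also be written as `if words and words[0] == 'From':`, which short-circuits in the same way.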

Tomerikoo