
Background:

I am currently writing a program in Python to parse through IIS web server log files and detect 500 and 403 errors. Somewhere in the program I have the following if statement:

if line.split(" ")[fmap["statusCode"]] == "500" or line.split(" ")[fmap["statusCode"]] == "403":
    # cool stuff happens below

Question:

What I am wondering is, would it be more efficient to split line and store the resulting list in a variable, since it is used twice in the if statement? For example, is the following code more efficient?

# Is this faster and more efficient?
chunks = line.split(" ")
if chunks[fmap["statusCode"]] == "500" or chunks[fmap["statusCode"]] == "403":
    # cool stuff happens

Or does Python remember the value of line.split(" ") from the first clause and reuse it in the second clause of the same if statement? If it recalculates line.split(" ") the second time, that would hurt efficiency, especially for long lines.
I am especially interested in what happens in memory when these different if statements run.

Any insight on this is appreciated!

Chris_Rands
Captain Jack Sparrow

1 Answer


No, the result of line.split(" ") is not cached. Each occurrence in the condition is a separate call, so the first version can split the same line twice. Definitely prefer the second way.
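One way to check this is to count the calls. The sketch below uses a hypothetical str subclass, CountingLine, whose only purpose is to count split() calls; the sample log lines and the field position in fmap are made up for illustration. It also shows a detail worth knowing: because `or` short-circuits, the second split only runs when the first comparison is false. Each call that does run builds a brand-new list, which is discarded once the comparison finishes.

```python
class CountingLine(str):
    """Illustration only: a str that counts how often split() is called."""
    calls = 0

    def split(self, sep=None, maxsplit=-1):
        CountingLine.calls += 1
        return super().split(sep, maxsplit)


# Hypothetical field map: the status code is the third space-separated field.
fmap = {"statusCode": 2}

# Case 1: status is neither 500 nor 403, so both sides of `or` are evaluated.
line = CountingLine("GET /index.html 200")
if line.split(" ")[fmap["statusCode"]] == "500" or line.split(" ")[fmap["statusCode"]] == "403":
    pass
calls_when_no_match = CountingLine.calls

# Case 2: the first comparison is true, so `or` short-circuits
# and the second split never runs.
CountingLine.calls = 0
line = CountingLine("GET /index.html 500")
if line.split(" ")[fmap["statusCode"]] == "500" or line.split(" ")[fmap["statusCode"]] == "403":
    pass
calls_when_first_matches = CountingLine.calls

print(calls_when_no_match, calls_when_first_matches)
```

Since most lines in a healthy log are neither 500 nor 403, the two-split path is likely the common one, which is why storing the list in a variable tends to pay off.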

Another option is a membership test:

if line.split(" ")[fmap["statusCode"]] in {"500", "403"}:

That way the split expression appears only once.
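Both ideas can be combined: split once, then test membership. This is a minimal sketch, not the asker's actual program; the helper name is_error_line, the constant ERROR_CODES, the sample lines, and the field position in fmap are all assumptions made for the example, and real IIS logs have more fields than shown here.

```python
# Status codes of interest; a set gives a single, readable membership test.
ERROR_CODES = {"500", "403"}


def is_error_line(line, fmap):
    """Return True if the line's status code field is one of ERROR_CODES."""
    chunks = line.split(" ")  # split exactly once per line
    return chunks[fmap["statusCode"]] in ERROR_CODES


# Hypothetical layout: date, method, path, status code.
fmap = {"statusCode": 3}

sample_lines = [
    "2021-01-01 GET /index.html 200",
    "2021-01-01 GET /admin 403",
    "2021-01-01 POST /api/save 500",
]

flagged = [line for line in sample_lines if is_error_line(line, fmap)]
print(flagged)
```

Keeping chunks in a local variable also makes the other fields available inside the if block without splitting the line again.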

Carcigenicate