2

Assume that I am looking for the index at which some_array equals target. I know that Python has list comprehensions and that NumPy has np.where(), both of which would work well for my purposes. But also assume that I want to do it with if-elif-else statements or with a for loop. For an array of length 3, the implementations would look like this:

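def find_with_chain(some_array, target):
    # if-elif-else version: one branch per index
    if some_array[0] == target:
        return 0
    elif some_array[1] == target:
        return 1
    else:
        return 2

def find_with_loop(some_array, target):
    # for-loop version: same comparisons, driven by the index
    for i in range(3):
        if some_array[i] == target:
            return i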

So, when is it better to use a for loop over an if-elif-else statement? I am mostly interested in how this applies in Python and in C, i.e., switch-cases.

My subquestions would be:

  • Do compilers (or in Python's case, Numba or Cython) switch from a for loop to switch-cases or vice versa if they determine that the other approach is faster?
  • Is there a generally accepted good-coding practice that suggests a maximum length for an if-elif-else statement for better readability?
  • Is there a threshold for the length of the array or the number of iterations where one of them performs better than the other?

I apologise if this has been asked before. I checked the suggested questions, but they were not helpful for my purposes.

Thanks in advance!

ck1987pd
  • 1
    "Code reuse" is always better than "copy/paste/adapt"... except, perhaps, in the trivial case of 2 operations. – Fe2O3 Jan 05 '23 at 21:27
  • 1
    Switch-cases are generally replaced by a jump table, which is not always faster than a series of conditional jumps. Some compilers adapt the generated code based on that. Still, the best choice depends on the input dataset, which the compiler does not know, so it generally cannot produce efficient code. All of this relates to two complex topics: [branch prediction](https://stackoverflow.com/a/11227902/12939557) and instruction decoding. Because of this, loops can actually be faster than unrolled code in some cases. This is why readability and profiling matter a lot. – Jérôme Richard Jan 06 '23 at 02:45
  • 1
    In most cases, it is better to *let compilers do the job for you*, as the choice is also architecture dependent. Good compilers can generate fast SIMD code when the number of iterations is large (though SIMD instructions can actually be slower depending on the use case and the input data). They will typically not do that with a set of conditionals. – Jérôme Richard Jan 06 '23 at 02:48
  • @JérômeRichard Thanks a lot! This was as helpful as the answers. – ck1987pd Jan 06 '23 at 04:12

2 Answers

6

So, when is it better to use a for loop over an if-elif-else statement?

A loop is always clearer than an if-else if-else chain for this particular purpose. That is sufficient reason to prefer the loop, except possibly in the highly unlikely case that you trace a performance bottleneck to such a loop and find that it is relieved by changing to an if.
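As a minimal sketch of why the loop reads more clearly, here is a version that works for any length without adding branches. The function name `find_index` and the `-1` "not found" convention are assumptions made for illustration; they are not part of the question.

```python
def find_index(some_array, target):
    """Return the index of the first element equal to target, or -1 if absent."""
    # enumerate yields (index, value) pairs, so no length is hard-coded.
    for i, value in enumerate(some_array):
        if value == target:
            return i
    return -1


print(find_index([7, 8, 9], 9))  # 2
print(find_index([7, 8, 9], 5))  # -1
```

The equivalent if-elif-else chain would need one extra branch for every extra element.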

  • Do compilers (or in Python's case, Numba or Cython) switch from a for loop to switch-cases or vice versa if they determine that the other approach is faster?

Loop unrolling is a standard optimization that many C compilers will perform when they think it will yield an improvement. A short loop might be unrolled completely out of existence, which is effectively the transformation you ask about.

I am not aware of compilers performing the reverse transformation.
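To make the unrolling transformation concrete, here is a hand-written Python sketch of a three-iteration search in rolled and unrolled form. It only illustrates the idea: a C compiler performs this on the code it generates, and to my knowledge CPython does not unroll loops. The names `search_loop` and `search_unrolled`, and the `-1` fallback, are hypothetical.

```python
def search_loop(some_array, target):
    # Rolled form: one comparison per iteration, three iterations.
    for i in range(3):
        if some_array[i] == target:
            return i
    return -1


def search_unrolled(some_array, target):
    # Unrolled form: the same three comparisons written out, no loop counter.
    if some_array[0] == target:
        return 0
    if some_array[1] == target:
        return 1
    if some_array[2] == target:
        return 2
    return -1


# Both forms agree on every input, which is what lets a compiler swap them.
data = [4, 5, 6]
for t in (4, 5, 6, 99):
    assert search_loop(data, t) == search_unrolled(data, t)
```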

  • Is there a generally expected good coding practice that suggests a maximum length for an if-elif-else statement to ensure ease of following the code?

Not as such.

Write clear code, and do not repeat yourself.

  • Is there a threshold for the length of the array or the number of iterations where one of them performs better than the other?

Not in particular. Performance has many factors, and few clear rules.

In general, first make it work, then, if necessary, make it fast.
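If you do reach the "make it fast" step, measuring is more reliable than guessing. The following is a rough sketch using the standard-library `timeit` module. The function names `with_chain` and `with_loop` and the sample data are assumptions, and the timings it prints will differ between machines and interpreters.

```python
import timeit


def with_chain(some_array, target):
    if some_array[0] == target:
        return 0
    elif some_array[1] == target:
        return 1
    else:
        return 2


def with_loop(some_array, target):
    for i in range(3):
        if some_array[i] == target:
            return i


data = [10, 20, 30]

# Check correctness first ("make it work") ...
for t in data:
    assert with_chain(data, t) == with_loop(data, t) == data.index(t)

# ... then measure ("make it fast"). Timings vary by machine and interpreter.
for name, fn in [("chain", with_chain), ("loop", with_loop)]:
    seconds = timeit.timeit(lambda: fn(data, 30), number=100_000)
    print(f"{name}: {seconds:.4f} s for 100,000 calls")
```

Searching for the last element exercises the worst case for both versions; other targets and array lengths may rank them differently.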

John Bollinger
3

The performance of an if is not going to matter. A for loop would be better than a lot of ifs for readability.

In this specific case, just use a builtin, which is generally faster.

return some_array.index(target)

If it's possible that target is not in the list:

try:
    return some_array.index(target)
except ValueError:
    return -1
Samathingamajig
  • Thanks, you have my upvote. But I did not ask for built-in methods. As I said, I know a few of them, and not all loops look up a specific value in an array. – ck1987pd Jan 05 '23 at 21:15
  • The only way to know the specific performance is to benchmark your code on multiple machines. Most Python implementations don't make many optimizations. – Samathingamajig Jan 05 '23 at 21:19
  • I understand that it would be impossible to get an exact threshold value unless someone plays with Maxwell's demon for an extremely mathematical response. But there can still be a ballpark answer for a threshold. – ck1987pd Jan 05 '23 at 21:21
  • 2
    The ballpark answer is "just use a loop" IMO. I like the observation in the other answer that the compiler can easily unroll a loop in cases where it's advantageous, but it can't easily go in the other direction. That suggests that from the coder's perspective the best thing is *always* to use a loop in the code instead of trying to second-guess the compiler (because the compiler is almost always going to be better than you at optimizing machine code). – Samwise Jan 05 '23 at 23:38