
Say I have the list [34523, 55, 65, 2].

What is the most efficient way to get [3, 5, 6, 2], the most significant digit of each number? If possible, without converting each number with str().

Ozgur Vatansever
Nicky Feller
  • If you only have integers, take a look at [this answer](http://stackoverflow.com/a/1068937/5509239). It's the C version of your question. – memoselyk Nov 27 '15 at 16:42

2 Answers


Assuming you're only dealing with positive numbers, you can divide each number by the largest power of 10 that is less than or equal to it, and then take the floor of the result.

>>> from math import log10, floor
>>> lst = [34523, 55, 65, 2]
>>> [floor(x / (10**floor(log10(x)))) for x in lst]
[3, 5, 6, 2]

If you're using Python 3, instead of flooring the result, you can use the integer division operator //:

>>> [x // (10**floor(log10(x))) for x in lst]
[3, 5, 6, 2]
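One caveat with the log10 approach: log10 returns a float, so for an integer sitting just below a large power of ten the computed exponent can come out one too high, and the division then yields 0. Here is a minimal sketch of a guarded version, assuming positive integers only. The name leading_digit_log is made up for illustration, and the single correction step relies on the float exponent being off by at most one.

```python
from math import floor, log10


def leading_digit_log(x):
    """Most significant digit of a positive int, via log10 plus a correction."""
    p = 10 ** floor(log10(x))  # candidate power of ten; may be off by one step
    if p > x:
        # log10 rounded up across a power-of-ten boundary
        p //= 10
    elif p * 10 <= x:
        # log10 rounded down across a power-of-ten boundary
        p *= 10
    return x // p


lst = [34523, 55, 65, 2]
digits = [leading_digit_log(x) for x in lst]
print(digits)
```

The correction uses only integer comparisons, so the result no longer depends on how the float logarithm happened to round.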

However, I have no idea whether this is more efficient than just converting to a string and slicing the first character. (Note that you'll need to be a bit more sophisticated if you have to deal with numbers between 0 and 1.)

>>> [int(str(x)[0]) for x in lst]
[3, 5, 6, 2]
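For the numbers between 0 and 1 mentioned above, one string-based option is to let the decimal module parse the number's str() form, since the digit tuple of a Decimal carries no leading zeros. This is only a sketch: leading_digit is a hypothetical helper name, and it still goes through str(), so it does not avoid the conversion the question hoped to skip.

```python
from decimal import Decimal


def leading_digit(x):
    """First significant decimal digit of x, based on its str() form."""
    # as_tuple() yields (sign, digits, exponent); digits has no leading zeros,
    # so digits[0] is the most significant digit (0 only when x is zero).
    return Decimal(str(x)).as_tuple().digits[0]


samples = [34523, 0.0042, 0.3, -65, 1e-07]
digits = [leading_digit(x) for x in samples]
print(digits)
```

Going through str(x) rather than Decimal(x) keeps the short printed form of a float, so 0.3 is read as "0.3" rather than as its exact binary expansion.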

If this is in a performance-critical piece of code, you should measure the two options and see which is faster. If it's not in a performance-critical piece of code, use whichever one is most readable to you.

senshin

I did some timings using Python 3.6.1:

from timeit import timeit

from math import floor, log10


lst = list(range(1, 10_000_000))


# 3.6043569352230804 seconds
def most_significant_str(i):
    return int(str(i)[0])


# 3.7258850016013865 seconds
def most_significant_while_floordiv(i):
    while i >= 10:
        i //= 10
    return i


# 4.515933519736952 seconds
def most_significant_times_floordiv(i):
    n = 10
    # >= (not >) so that exact powers of 10 (10, 100, ...) yield 1
    while i >= n:
        n *= 10
    return i // (n // 10)


# 4.661690454738387 seconds
def most_significant_log10_floordiv(i):
    return i // (10 ** (log10(i) // 1))


# 4.961193803243334 seconds
def most_significant_int_log(i):
    return i // (10 ** int(log10(i)))


# 5.722346990002692 seconds
def most_significant_floor_log10(i):
    return i // (10 ** floor(log10(i)))


for f in (
    'most_significant_str',
    'most_significant_while_floordiv',
    'most_significant_times_floordiv',
    'most_significant_log10_floordiv',
    'most_significant_int_log',
    'most_significant_floor_log10',
):
    print(
        f,
        timeit(
            f"""
for i in lst:
    {f}(i)
            """,
            globals=globals(),
            number=1,
        ),
    )

As you can see, for numbers in range(1, 10_000_000), int(str(i)[0]) was faster than the other methods. The closest I could get was a simple while loop:

def most_significant_while_floordiv(i):
    while i >= 10:
        i //= 10
    return i
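Before trusting timings like these, it may be worth checking that the variants actually agree, especially around powers of ten where off-by-one mistakes tend to hide. A small sketch of such a check for the two fastest functions (the boundaries and mismatches names are just for illustration):

```python
def most_significant_str(i):
    return int(str(i)[0])


def most_significant_while_floordiv(i):
    while i >= 10:
        i //= 10
    return i


# Values on either side of each power of ten from 10 up to 10_000_000.
boundaries = [10 ** k + d for k in range(1, 8) for d in (-1, 0, 1)]

# Collect every input where the two implementations disagree.
mismatches = [
    i for i in boundaries
    if most_significant_str(i) != most_significant_while_floordiv(i)
]
print(mismatches)  # an empty list means the two functions agree
```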
AXO