I am not adding much to the party, except for some timing on the most noteworthy solutions:
import bisect
def grade_bis(score, thresholds=(0.9, 0.8, 0.7, 0.6), grades="ABCDF"):
i = bisect.bisect(thresholds[::-1], score)
return grades[-i - 1]
def grade_gen(score, thresholds=(0.9, 0.8, 0.7, 0.6), grades="ABCDF"):
return next((
grade
for grade, threshold in zip(grades, thresholds)
if score >= threshold), grades[-1])
def grade_enu(score, thresholds=(0.9, 0.8, 0.7, 0.6), grades="ABCDF"):
for i, threshold in enumerate(thresholds):
if score >= threshold:
return grades[i]
return grades[-1]
- using basic algebra -- although this does not generalize to arbitrary breakpoints, while the above do (based on @RiccardoBucco's answer):
def grade_alg(score, grades="ABCDF"):
return grades[-max(int(score * 10) - 5, 0) - 1]
- using a chain of
if
-elif
-else
(essentially, OP's approach, which also does not generalize):
def grade_iff(score):
if score >= 0.9:
return "A"
elif score >= 0.8:
return "B"
elif score >= 0.7:
return "C"
elif score >= 0.6:
return "D"
else:
return "F"
They all give the same result:
import random
random.seed(2)
scores = [round(random.random(), 2) for _ in range(10)]
print(scores)
# [0.96, 0.95, 0.06, 0.08, 0.84, 0.74, 0.67, 0.31, 0.61, 0.61]
funcs = grade_bis, grade_gen, grade_enu, grade_alg
for func in funcs:
print(f"{func.__name__:>12}", list(map(func, scores)))
# grade_bis ['A', 'A', 'F', 'F', 'B', 'C', 'D', 'F', 'D', 'D']
# grade_gen ['A', 'A', 'F', 'F', 'B', 'C', 'D', 'F', 'D', 'D']
# grade_enu ['A', 'A', 'F', 'F', 'B', 'C', 'D', 'F', 'D', 'D']
# grade_alg ['A', 'A', 'F', 'F', 'B', 'C', 'D', 'F', 'D', 'D']
# grade_iff ['A', 'A', 'F', 'F', 'B', 'C', 'D', 'F', 'D', 'D']
with the following timings (on n = 100000
repetitions into a list
):
n = 100000
scores = [random.random() for _ in range(n)]
base = list(map(funcs[0], scores))
for func in funcs:
res = list(map(func, scores))
is_good = base == res
print(f"{func.__name__:>12} {is_good} ", end="")
%timeit -n 4 -r 4 list(map(func, scores))
# grade_bis True 4 loops, best of 4: 46.1 ms per loop
# grade_gen True 4 loops, best of 4: 96.6 ms per loop
# grade_enu True 4 loops, best of 4: 54.4 ms per loop
# grade_alg True 4 loops, best of 4: 47.3 ms per loop
# grade_iff True 4 loops, best of 4: 17.1 ms per loop
indicating that the OP's approach is the fastest by far, and, among those that can be generalized to arbitrary thresholds, the bisect
-based approach is the fastest in the current settings.
As a function of the number of thresholds
Given that the linear search should be faster than binary search for very small inputs, it is interesting both to see where is the break-even point and to confirm that this application of binary search grows sub-linearly (logarithmically).
To do so, a benchmark as a function of the number of thresholds is provided (excluding thos:
import string
n = 1000
m = len(string.ascii_uppercase)
scores = [random.random() for _ in range(n)]
timings = {}
for i in range(2, m + 1):
breakpoints = [round(1 - x / i, 2) for x in range(1, i)]
grades = string.ascii_uppercase[:i]
print(grades)
timings[i] = []
base = [funcs[0](score, breakpoints, grades) for score in scores]
for func in funcs[:-2]:
res = [func(score, breakpoints, grades) for score in scores]
is_good = base == res
timed = %timeit -r 16 -n 16 -q -o [func(score, breakpoints, grades) for score in scores]
timing = timed.best * 1e3
timings[i].append(timing if is_good else None)
print(f"{func.__name__:>24} {is_good} {timing:10.3f} ms")
which can be plotted with the following:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(data=timings, index=[func.__name__ for func in funcs[:-2]]).transpose()
df.plot(marker='o', xlabel='Input size / #', ylabel='Best timing / µs', figsize=(6, 4))
fig = plt.gcf()
fig.patch.set_facecolor('white')
to produce:

suggesting that the break-even point is around 5
, also confirming the linear growth of grade_gen()
and grade_enu()
, and the sub-linear growth of grade_bis()
.
NumPy-based approaches
Approaches that are capable of working with NumPy should be evaluated separately, as they take different inputs and are capable of processing arrays in a vectorized fashion.