I wrote a function that puts my order values into buckets, as shown below. I have 18 buckets for now, but how can I scale this function to, say, 40 buckets?

def cat(x):
    if x < 50:
        return '< $50'
    elif x < 75:
        return '\\$50~$75'
    elif x < 100:
        return '\\$75~$100'
    elif x < 125:
        return '\\$100~$125'
    elif x < 150:
        return '\\$125~$150'
    elif x < 175:
        return '\\$150~$175'
    elif x < 200:
        return '\\$175~$200'
    elif x < 250:
        return '\\$200~$250'
    elif x < 300:
        return '\\$250~$300'
    elif x < 350:
        return '\\$300~$350'
    elif x < 400:
        return '\\$350~$400'
    elif x < 500:
        return '\\$400~$500'
    elif x < 600:
        return '\\$500~$600'
    elif x < 700:
        return '\\$600~$700'
    elif x < 800:
        return '\\$700~$800'
    elif x < 900:
        return '\\$800~$900'
    elif x < 1000:
        return '\\$900~$1000'
    else:
        return '\\$1000 over'
mkrieger1
Data_Yoda
  • You need to add more constraints. Do you want to divide 1 to 1000 into smaller intervals? Or is there a different upper limit, like 100,000? – pedro_bb7 Jan 09 '22 at 23:15

2 Answers

I would use the bisect module:

import bisect

cutoffs = [50, 75, 100, 125, 150, 175, 200, 250, 300, 350, 400, 500, 600, 700, 800, 900, 1000]

def cat(x):
    i = bisect.bisect(cutoffs, x)
    if i == 0:
        return f'< ${cutoffs[0]}'
    elif i == len(cutoffs):
        return f'\\${cutoffs[-1]} over'
    else:
        lo, hi = cutoffs[i-1], cutoffs[i]
        return f'\\${lo}~${hi}'
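
To see why `bisect.bisect` lines up with the original `if x < ...` chain, it may help to look at the raw index it returns at the bucket edges. The snippet below is a minimal sketch; `bucket_index` and `samples` are illustrative names, not part of the answer above. `bisect.bisect` is an alias for `bisect.bisect_right`, so a value equal to a cutoff falls into the bucket that starts at that cutoff, which matches the strict `<` comparisons in the question.

```python
import bisect

cutoffs = [50, 75, 100, 125, 150, 175, 200, 250, 300, 350,
           400, 500, 600, 700, 800, 900, 1000]

def bucket_index(x):
    # Hypothetical helper: returns how many cutoffs are <= x.
    # 0 means "below the first cutoff"; len(cutoffs) means "at or above the last".
    return bisect.bisect(cutoffs, x)

# Values just below and exactly on a few bucket edges.
samples = [49.99, 50, 74.99, 75, 999.99, 1000]
indices = [bucket_index(x) for x in samples]
print(indices)  # [0, 1, 1, 2, 16, 17]
```

Going from 18 to 40 buckets then only means adding entries to `cutoffs`, as long as the list stays sorted.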
Dennis

You can condense the bounds for the buckets into a list, and then use ranges to make generating the list more concise. Then, you can perform a search over these bounds to return the desired string.

def cat(x):
    bounds = list(range(50, 200, 25)) + list(range(200, 400, 50)) + list(range(400, 1100, 100)) 
    if x < bounds[0]:
        return f'< ${bounds[0]}'
    if x >= bounds[-1]:
        return f'\\${bounds[-1]} over'
        
    # If you have a large number of bounds, this can be sped up further using binary search.
    for i in range(len(bounds) - 1):
        if bounds[i] <= x < bounds[i + 1]:
            return f'\\${bounds[i]}~${bounds[i + 1]}'
    
    # Should never reach here.
    raise ValueError('No bound found.')
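
The comment in the loop above mentions binary search as a possible speed-up. One way that could look, as a sketch rather than a definitive version, is to build the bounds once at module level and let the standard-library `bisect` module do the lookup. The names `BOUNDS` and `cat_bisect` are made up for this illustration.

```python
import bisect

# Built once instead of on every call; same values as `bounds` above.
BOUNDS = [*range(50, 200, 25), *range(200, 400, 50), *range(400, 1100, 100)]

def cat_bisect(x, bounds=BOUNDS):
    # bisect_right returns the number of bounds that are <= x.
    i = bisect.bisect_right(bounds, x)
    if i == 0:
        return f'< ${bounds[0]}'
    if i == len(bounds):
        return f'\\${bounds[-1]} over'
    # Here bounds[i - 1] <= x < bounds[i].
    return f'\\${bounds[i - 1]}~${bounds[i]}'

print(cat_bisect(49))    # < $50
print(cat_bisect(225))   # \$200~$250
print(cat_bisect(1000))  # \$1000 over
```

Because `bounds` is a parameter, a longer sorted list (for example one with 40 entries) can be passed in without changing the function.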
BrokenBenchmark
  • This was what I was thinking. It's much easier to understand if one is not familiar with binary search (which is employed by the [other](https://stackoverflow.com/a/70646341/2089675) answer). – smac89 Jan 09 '22 at 23:26
  • There is an alternative way of making the list of bounds: since Python 3.5 (PEP 448) you can use the unpacking notation from function calls inside a list display, which makes that line shorter: `[*range(50, 200, 25), *range(200, 400, 50), *range(400, 1100, 100)]` – Copperfield Jan 09 '22 at 23:39