4

I have a program whose input has changed. Originally, the values were integers with units such as "kbps", "Mbps" and "Gbps". For example:

10kbps
20Mbps
20Gbps

I used "if/then" to convert all strings to bps:

if "kbps" in bandwidth:
    bw=bandwidth.replace('kbps', '000')
elif "Mbps" in bandwidth:
    bw=bandwidth.replace('Mbps', '000000')
elif "Gbps" in bandwidth:
    bw=bandwidth.replace('Gbps', '000000000')
else:
    bw='0'

I'd print "bw" as an integer "int(bw)", and it worked fine. Now, the input has changed. Here are samples of actual inputs:

3.838Mbps
100kbps
126.533kbps
5.23Mbps
100Mbps
1.7065Gbps
20Gbps

The numbers are no longer always integers. Not only does int(bw) fail on them, but the unit conversion itself breaks with decimals: for example, 3.838Mbps becomes 3.838000000.

Can someone suggest an efficient way to work with these inputs? I can't find the right balance of splitting, regexp matching, etc. and am wondering if there are some methods I don't know about.

martineau
user3746195
  • Not a full answer: I think I'd split dots and digits from letters using a regexp, look up the multiplier in a table, convert the digits to a float (or Decimal, if humans are going to look at it and will find the float precision off-putting), and multiply. – Max Nov 14 '18 at 22:33
  • What if you split your expression between the float part and the string part (regexp), and then multiply the float part by a power of 10 based on the value of the string part? – Patol75 Nov 14 '18 at 22:38
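The approach suggested in these comments could be sketched roughly as follows. This is only an illustration, not code from either commenter: the names MULTIPLIERS, PATTERN and to_bps are made up here, and it assumes every input ends in "bps" with an optional k, M or G prefix. Decimal is used so that values such as 126.533 are not subject to binary float rounding.

```python
import re
from decimal import Decimal

# Hypothetical lookup table (not from the question): SI prefix -> multiplier.
MULTIPLIERS = {"": 1, "k": 10**3, "M": 10**6, "G": 10**9}

# Digits with an optional fractional part, an optional prefix, then "bps".
PATTERN = re.compile(r"(\d+(?:\.\d+)?)([kMG]?)bps")

def to_bps(text):
    """Return the bandwidth in bits per second as a Decimal."""
    match = PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised bandwidth: {text!r}")
    number, prefix = match.groups()
    # Decimal parses the digits exactly; multiplying by an int stays exact.
    return Decimal(number) * MULTIPLIERS[prefix]

for sample in ["3.838Mbps", "100kbps", "126.533kbps", "1.7065Gbps"]:
    # int() is safe for these samples because the products are whole numbers.
    print(sample, "->", int(to_bps(sample)))
```

If a plain float is good enough, swapping Decimal(number) for float(number) is the only change needed.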

4 Answers

4

I'd do something like:

import re

UNITS = {
    'k': 1e3,
    'm': 1e6,
    'g': 1e9,
    't': 1e12,
}

def parse_bps(s):
    m = re.match(r'^([0-9]+(\.[0-9]*)?)([tgmk])?bps$', s, re.IGNORECASE)
    if not m:
        raise ValueError(f"unsupported value for parse_bps: {repr(s)}")
    val = float(m.group(1))
    unit = m.group(3)
    if unit:
        val *= UNITS[unit.lower()]
    return val


tests = [
    '10bps', '10kbps', '20Mbps', '20Gbps', '3.838Mbps',
    '3.838Mbps', '100kbps', '126.533kbps', '5.23Mbps',
    '100Mbps', '1.7065Gbps', '20Gbps',
]

for s in tests:
    print(f'  {repr(s)} = {parse_bps(s)}')

though it's probably better not to ignore case. SI prefixes are case-sensitive: "M" is mega (10^6) while "m" is milli (10^-3). Generally speaking, capital prefixes multiply and lowercase ones divide, with "k" (kilo) as a notable exception.
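A case-sensitive variant might look something like the sketch below. The names UNITS_CS and parse_bps_strict are invented for this illustration, and it assumes only the k, M, G and T prefixes are wanted, so a lowercase "m" is rejected rather than read as mega.

```python
import re

# Case-sensitive table (hypothetical name); lowercase 'm' (milli) is left out on purpose.
UNITS_CS = {"k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}

def parse_bps_strict(s):
    # No re.IGNORECASE here, so 'Mbps' matches but 'mbps' does not.
    m = re.fullmatch(r"([0-9]+(?:\.[0-9]*)?)([kMGT])?bps", s)
    if not m:
        raise ValueError(f"unsupported value for parse_bps_strict: {s!r}")
    val = float(m.group(1))
    unit = m.group(2)
    if unit:
        val *= UNITS_CS[unit]
    return val

print(parse_bps_strict("20Gbps"))
```

Whether rejecting "mbps" is the right call depends on how consistent the input source is about capitalisation.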

Sam Mason
4

I think it might be better to use regular expressions here, and capture the "number part", and the "unit prefix part":

import re

bandwidth_rgx = re.compile(r'^(\d*(?:\.\d*)?)\s*([GMk]?)(?:bps)?$')

so now we can match a given string, and obtain the number part, and the unit prefix part:

>>> bandwidth_rgx.search('126.533kbps')[1]
'126.533'
>>> bandwidth_rgx.search('126.533kbps')[2]
'k'

so we can make a conversion dictionary, like:

unit_prefix = {
    '': 1,
    'k': 1000,
    'M': 1000000,
    'G': 1000000000
}

and use a function to obtain the bandwidth, for example as a float:

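def get_bandwidth(text):
    m = bandwidth_rgx.search(text)
    if m is None:  # no match at all, e.g. an unknown unit
        raise ValueError(f"cannot parse bandwidth: {text!r}")
    return float(m[1]) * unit_prefix[m[2]]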

for example:

>>> get_bandwidth('3.838Mbps')
3838000.0
>>> get_bandwidth('100kbps')
100000.0
>>> get_bandwidth('126.533kbps')
126533.0
>>> get_bandwidth('5.23Mbps')
5230000.0
>>> get_bandwidth('100Mbps')
100000000.0
>>> get_bandwidth('1.7065Gbps')
1706500000.0
>>> get_bandwidth('20Gbps')
20000000000.0
Willem Van Onsem
4

You can convert the value before kbps, Mbps or Gbps to a float and multiply it by 1,000, 1,000,000 and so on. That works for both integer and decimal inputs.

For example:

# Note: bw ends up as a string such as '3838000.0', so read it back with float(bw), not int(bw).
if "kbps" in bandwidth:
    bw = bandwidth.replace('kbps', '')
    bw = str(float(bw) * 1000)
elif "Mbps" in bandwidth:
    bw = bandwidth.replace('Mbps', '')
    bw = str(float(bw) * 1000 * 1000)
elif "Gbps" in bandwidth:
    bw = bandwidth.replace('Gbps', '')
    bw = str(float(bw) * 1000 * 1000 * 1000)
else:
    bw = '0'
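The same idea can also be written as a loop over a small table, which avoids repeating the branch for each unit. This is just a sketch under assumptions: SUFFIXES and bandwidth_to_bps are hypothetical names, and round() is used on the assumption that a whole number of bits per second is wanted, as in the original integer-based program.

```python
# Hypothetical table of (suffix, multiplier) pairs; not part of the original answer.
SUFFIXES = (
    ("kbps", 1_000),
    ("Mbps", 1_000_000),
    ("Gbps", 1_000_000_000),
)

def bandwidth_to_bps(bandwidth):
    """Convert strings such as '3.838Mbps' to an integer number of bits per second."""
    for suffix, factor in SUFFIXES:
        if bandwidth.endswith(suffix):
            number = bandwidth[:-len(suffix)]
            # round() absorbs tiny float errors and returns an int.
            return round(float(number) * factor)
    return 0  # same fallback as the if/elif chain for unrecognised units

for sample in ["3.838Mbps", "100kbps", "126.533kbps", "20Gbps"]:
    print(sample, "->", bandwidth_to_bps(sample))
```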
Mamoon Raja
1

Answer that requires bare minimum code changes and assumes your input is safe:

if "kbps" in bandwidth:
    bw=bandwidth.replace('kbps', '* 1e3')
elif "Mbps" in bandwidth:
    bw=bandwidth.replace('Mbps', '* 1e6')
elif "Gbps" in bandwidth:
    bw=bandwidth.replace('Gbps', '* 1e9')
else:
    bw='0'
bw_int = eval(bw)  # evaluate the rewritten expression, e.g. '3.838* 1e6', not the raw input

Of course, make sure your input is safe, and there are some security and performance considerations for eval (Why is using 'eval' a bad practice?), but sometimes, you just want to get something out quickly :)

Athena