I need to convert strings that contain memory usage in bytes, like 1048576
(which is 1M), into a human-readable version, and vice versa.
Note: I have already looked here: Reusable library to get human readable version of file size?
And here (even though it isn't Python): How to convert human readable memory size into bytes?
Neither of them helped, so I looked elsewhere.
I have found something that does this for me here: http://code.google.com/p/pyftpdlib/source/browse/trunk/test/bench.py?spec=svn984&r=984#137 or, as a shorter URL: http://goo.gl/zeJZl
The code:
def bytes2human(n, format="%(value)i%(symbol)s"):
    """
    >>> bytes2human(10000)
    '9K'
    >>> bytes2human(100001221)
    '95M'
    """
    symbols = ('B', 'K', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y')
    prefix = {}
    for i, s in enumerate(symbols[1:]):
        prefix[s] = 1 << (i+1)*10
    for symbol in reversed(symbols[1:]):
        if n >= prefix[symbol]:
            value = float(n) / prefix[symbol]
            return format % locals()
    return format % dict(symbol=symbols[0], value=n)
There is also a function for converting the other way (same site):
def human2bytes(s):
    """
    >>> human2bytes('1M')
    1048576
    >>> human2bytes('1G')
    1073741824
    """
    symbols = ('B', 'K', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y')
    letter = s[-1:].strip().upper()
    num = s[:-1]
    assert num.isdigit() and letter in symbols
    num = float(num)
    prefix = {symbols[0]:1}
    for i, s in enumerate(symbols[1:]):
        prefix[s] = 1 << (i+1)*10
    return int(num * prefix[letter])
This is great and all, but it loses information. For example:
>>> bytes2human(10000)
'9K'
>>> human2bytes('9K')
9216
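To make the loss concrete, here is a small sketch of the arithmetic behind the round trip above. The variable names are only illustrative and are not part of the pyftpdlib code:

```python
# Where the loss comes from: the default '%(value)i' format truncates
# the fractional part of the value before it is printed.
n = 10000
K = 1 << 10                 # 1024 bytes, the 'K' prefix used by both functions
value = float(n) / K        # 9.765625
truncated = int(value)      # 9, which is what '%i' prints
restored = truncated * K    # what human2bytes('9K') computes
lost = n - restored         # bytes that cannot be recovered

print(value, truncated, restored, lost)
```

So the 784 missing bytes are exactly the fractional part (0.765625 of a K) that the integer format throws away.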
To try to solve this, I changed the format argument of bytes2human
to: format="%(value).3f%(symbol)s"
which is much nicer and gives me this result:
>>> bytes2human(10000)
'9.766K'
But when I try to convert it back with the human2bytes function:
>>> human2bytes('9.766K')
Traceback (most recent call last):
  File "<pyshell#366>", line 1, in <module>
    human2bytes('9.766K')
  File "<pyshell#359>", line 12, in human2bytes
    assert num.isdigit() and letter in symbols
AssertionError
This is because of the decimal point: num is '9.766', and str.isdigit() returns False for any string that contains a '.'.
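A quick check in the interpreter seems to confirm where the assertion trips, and that float() itself has no problem with the same string. This is only a diagnostic sketch; the variable names are made up for illustration:

```python
# Reproduce the check that human2bytes performs on '9.766K'.
num = '9.766'                       # what s[:-1] yields for '9.766K'
is_digits = num.isdigit()           # False: '.' is not a digit character
parsed = float(num)                 # float() accepts the decimal point
approx_bytes = parsed * (1 << 10)   # scale by the 'K' prefix (1024)

print(is_digits, parsed, approx_bytes)
```

So the failure is in the isdigit() guard, not in the float conversion that follows it.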
So my question is: how can I convert a human-readable version back into bytes without data loss?
Note: I know that 3 decimal places also loses a little data. But for the purposes of this question, let's ignore that for now; I can always increase the precision.
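For reference, a rough sketch of how coarse three decimal places are for each unit, assuming the same powers-of-1024 prefixes as in the functions above (the names `steps` and `unit` are just for illustration):

```python
# Bytes represented by one step of 0.001 in each unit, i.e. the
# granularity of a value formatted with '%.3f'.
symbols = ('K', 'M', 'G', 'T')
steps = {}
for i, s in enumerate(symbols):
    unit = 1 << (i + 1) * 10     # 1024, 1024**2, ...
    steps[s] = unit / 1000.0

for s in symbols:
    print('%s: 0.001 is about %.3f bytes' % (s, steps[s]))
```

For K this is roughly one byte, but for G a single step in the third decimal place already covers about a megabyte, so the number of decimals needed depends on the unit.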