While trying to understand how base58check works, I noticed that the reference implementation from bitcoin (libbase58) uses the following formula to calculate the size needed to hold a base58-encoded string:

// https://github.com/bitcoin/libbase58/blob/master/base58.c#L155
size = (binsz - zcount) * 138 / 100 + 1;

where `binsz` is the size of the input buffer to encode, and `zcount` is the number of leading zero bytes in the buffer. Where do 138 and 100 come from, and why?

fluter
  • It means every 100 decoded characters become 138 encoded characters, or the other way around. – AaA Jan 19 '18 at 02:54
  • What's the theory behind that? – fluter Jan 19 '18 at 02:55
  • I'm not sure; I never studied bitcoin or the philosophy behind choosing the B58 encoding. My answer is based on B64, where the factor is `* 4 / 3`. [Wiki](https://en.wikipedia.org/wiki/Base58) says the alphabet was chosen `to avoid both non-alphanumeric characters and letters which might look ambiguous when printed`, such as `0, o, O` and `I, l, 1`. – AaA Jan 19 '18 at 03:06

1 Answer

tl;dr It’s a formula to approximate the output size during base58 <-> base256 conversion,
i.e. the encoding/decoding steps where you’re multiplying and taking remainders by 256 and 58.

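Encoding: each input byte needs log(256) / log(58) ≈ 1.3658 base58 digits, so the output is at most ~138% of the input size (the ratio is rounded up to 138/100, and the +1 covers the truncating integer division):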

n * log(256) / log(58) + 1  
(n * 138 / 100 + 1)
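
To make the bound concrete, here is a minimal sketch in C. Both function names are hypothetical and not part of libbase58; the first simply wraps the formula from the question, and the second counts the real number of base58 digits by repeated division by 58, so the two can be compared. It assumes, as in the question, that leading zero bytes are counted separately and excluded from the estimate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a libbase58 function): upper bound on the number
 * of base58 digits for the part of the input after the leading zero bytes,
 * using 138/100 as a rounded-up stand-in for log(256)/log(58) ~= 1.3658. */
static size_t b58_digits_bound(size_t binsz, size_t zcount)
{
    assert(zcount <= binsz);
    return (binsz - zcount) * 138 / 100 + 1;
}

/* Hypothetical helper: exact number of base58 digits of a big-endian byte
 * string, found by dividing the whole number by 58 until it reaches zero.
 * Leading zero bytes contribute no digits here. The buffer is overwritten. */
static size_t b58_digit_count(unsigned char *buf, size_t len)
{
    size_t digits = 0;
    size_t start = 0;

    /* Skip leading zero bytes. */
    while (start < len && buf[start] == 0)
        start++;

    while (start < len) {
        unsigned rem = 0;
        /* One long-division pass: buf = buf / 58, rem = buf % 58. */
        for (size_t i = start; i < len; i++) {
            unsigned cur = rem * 256 + buf[i];
            buf[i] = (unsigned char)(cur / 58);
            rem = cur % 58;
        }
        digits++; /* rem is the next base58 digit (least significant first) */

        /* Drop bytes that have become zero at the front of the quotient. */
        while (start < len && buf[start] == 0)
            start++;
    }
    return digits;
}
```

For the worst case of a given length (all bytes `0xFF`), `b58_digit_count` never exceeds `b58_digits_bound`, which is what makes the formula safe for sizing a buffer.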

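Decoding: each base58 digit carries log(58) / log(256) ≈ 0.7322 bytes, so the output is at most ~73% of the input size (the ratio is rounded up to 733/1000, and the +1 covers the truncating integer division):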

n * log(58) / log(256) + 1  
(n * 733 / 1000 + 1)
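
The decoding direction can be sketched the same way. The helper below is hypothetical (not taken from any library). As an added assumption beyond the formula above, it counts leading "1" characters separately, since each of them decodes to one whole zero byte, and applies the 733/1000 ratio only to the remaining digits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: upper bound on the decoded size in bytes of a base58
 * string of b58sz characters, of which the first `ones` are leading '1's.
 * Assumption: each leading '1' maps to exactly one zero byte, and the
 * remaining digits need at most log(58)/log(256) ~= 0.7322 bytes each,
 * approximated from above by 733/1000. */
static size_t b58_decoded_bound(size_t b58sz, size_t ones)
{
    assert(ones <= b58sz);
    return ones + (b58sz - ones) * 733 / 1000 + 1;
}
```

With no leading "1" characters this reduces to `n * 733 / 1000 + 1`, the formula above.
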
MatejMecka
  • Decoding output is usually 73% of the input size. But you could also have an input that is almost all leading zeros, in which case the decoded output would be ~100% of the input size (each leading "1" character decodes into a single zero byte). – achow Mar 06 '22 at 22:30