
I have some binary data that I want to convert to something more easily readable and copy/pastable.

The binary data shows up like this

?Q?O?,???W%ʐ):?g????????

Which is pretty ugly. I can convert it to hex with:

value.unpack("H*").first

But since hexadecimal only has 16 characters, it isn't very compressed. I end up with a string that is hundreds of chars long.

I'd prefer a format that uses letters (capitalized and lowercase), numbers, and basic symbols, to make best use of the possible values. What can I use?

I'd also prefer something built in to Ruby. I can't require another library unless it's really well known and popular.

I tried the stuff from http://apidock.com/ruby/String/unpack and couldn't find anything.

Some Guy
  • How are you reading/loading the binary data to begin with? What's the source of the data (e.g. is it streaming such as coming in over a network? or static in length, like a file?)? How are you rendering/outputting the raw data so that it shows up like `?Q?O?,???W%ʐ):?g????????`? – Anthony E May 05 '16 at 07:10

3 Answers


A simple method uses Base64 encoding to encode the value. It's very similar to Hex encoding (which is Base16), but uses a longer dictionary.

Base64 strings, when properly prepared, contain only printable characters. This is a benefit for copy/paste and for sharing.

The secondary benefit is its 3:4 encoding ratio, which makes it reasonably efficient. A 3:4 encoding ratio means that every 3 bytes of input are encoded as 4 characters of output (75% efficient). Hex encoding has a less efficient 1:2 ratio: every 1 byte of input is encoded as 2 characters (50% efficient).
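To make the size difference concrete, here is a small sketch that encodes the same bytes both ways and compares lengths. The 24 random bytes are only a stand-in for the value in the question, and `strict_encode64` is used so that no line feeds inflate the count.

```ruby
require "base64"
require "securerandom"

# 24 random bytes stand in for the binary value from the question (an assumption).
value = SecureRandom.random_bytes(24)

hex    = value.unpack("H*").first        # 2 characters per input byte
base64 = Base64.strict_encode64(value)   # 4 characters per 3 input bytes

puts "raw bytes: #{value.bytesize}"  # 24
puts "hex:       #{hex.length}"      # 48
puts "base64:    #{base64.length}"   # 32
```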

You can use the Ruby standard library Base64 implementation to encode and decode, like so:

require "base64"

encoded = Base64.encode64("Taste the thunder!") # <== "VGFzdGUgdGhlIHRodW5kZXIh\n"
decoded = Base64.decode64(encoded)              # <== "Taste the thunder!"
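One thing to watch for with copy/paste: `encode64` adds a line feed after every 60 encoded characters (and at the end, as the `\n` above shows). If a single-line value is wanted, `strict_encode64` from the same library may be a better fit. A minimal sketch, with a made-up 60-byte input:

```ruby
require "base64"

data = "x" * 60   # long enough that encode64 wraps its output

wrapped = Base64.encode64(data)          # line feed after every 60 encoded characters
single  = Base64.strict_encode64(data)   # one line, no line feeds

puts wrapped.count("\n")   # 2
puts single.count("\n")    # 0

# strict_decode64 rejects input containing line feeds, so pair it with strict_encode64.
puts Base64.strict_decode64(single) == data   # true
```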

There is also a (mostly) URL-safe variant, which lets you include an encoded value in a URL without any additional URL encoding. This allows you to pass information in a URL in an obscured way, including information that normally couldn't easily be passed that way.

Try this to encode your data:

encoded_url_param = Base64.urlsafe_encode64("cake+pie=yummy!")  # <== "Y2FrZStwaWU9eXVtbXkh"
decoded_url_param = Base64.urlsafe_decode64(encoded_url_param)  # <== "cake+pie=yummy!"

Using Base64 in a URL is not actual security, since anyone can decode it, but it will help keep casual prying eyes from your data and intent. The only potential downside to using Base64 values in a URL is that the URL must remain case-sensitive, and some applications don't honor that requirement. See the SO question "Should URL be case sensitive" for more information.
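As far as I can tell, the "mostly" comes from the trailing `=` padding, which the URL-safe variant still emits by default. On Ruby 2.3 and later, `urlsafe_encode64` accepts `padding: false` to drop it. A small sketch; the two input bytes are picked only because they produce `+` and `/` in standard Base64:

```ruby
require "base64"

bytes = "\xFB\xFF".b   # two bytes chosen to produce "+" and "/" in standard Base64

puts Base64.strict_encode64(bytes)                   # "+/8="
puts Base64.urlsafe_encode64(bytes)                  # "-_8="
puts Base64.urlsafe_encode64(bytes, padding: false)  # "-_8"

# On Ruby 2.3+, urlsafe_decode64 also accepts the unpadded form.
puts Base64.urlsafe_decode64("-_8") == bytes         # true
```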

Michael Gaskill

Sounds to me like you want base 64. It is part of the standard library:

require 'base64'
Base64.encode64(some_data)

Or using pack,

[some_data].pack("m")

The resulting data is about 4/3 the size of the input.
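For completeness, a sketch of the `pack` route in both directions, which needs no `require` at all. The `"m"` directive wraps its output with line feeds just like `Base64.encode64`, while `"m0"` produces a single line. The sample string is arbitrary.

```ruby
data = "Taste the thunder!" * 3   # 54 bytes, long enough to wrap

wrapped = [data].pack("m")    # same output as Base64.encode64: line feeds every 60 characters
single  = [data].pack("m0")   # same output as Base64.strict_encode64: no line feeds

puts wrapped.include?("\n")   # true
puts single.include?("\n")    # false

# unpack("m") reverses either form
puts single.unpack("m").first == data    # true
puts wrapped.unpack("m").first == data   # true
```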

Frederick Cheung

Base36 string encoding is another reasonable alternative to both Base64 and hex encoding. It uses only 36 characters, typically the ASCII lowercase letters and the digits 0-9.

There's no Ruby API that specifically does this, but the SO answer "Base36 Encode a String" shows how to do it efficiently in Ruby:

Encoding to Base36:

encoded = data.unpack('H*')[0].to_i(16).to_s(36)

Decoding from Base36:

hex = encoded.to_i(36).to_s(16)
decoded = [hex.length.odd? ? "0" + hex : hex].pack('H*')  # pad to whole bytes before packing
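One caveat with the one-liners above: the data passes through an integer, and an integer has no leading zeros, so any leading zero bytes in the input are lost on the way back. Below is one possible workaround, a sketch that prefixes a sentinel digit before converting. The helper names `base36_encode` and `base36_decode` are made up for illustration, and the output is not compatible with the plain one-liners.

```ruby
# Hypothetical helpers, not part of Ruby or of the linked answer.
def base36_encode(data)
  hex = data.unpack("H*").first
  # Prefix a "1" digit so leading zero bytes survive the integer conversion.
  ("1" + hex).to_i(16).to_s(36)
end

def base36_decode(encoded)
  hex = encoded.to_i(36).to_s(16)
  # Drop the "1" sentinel that base36_encode added.
  [hex[1..-1]].pack("H*")
end

data = "\x00\x01binary\xFF".b   # starts with a zero byte on purpose
encoded = base36_encode(data)

puts encoded
puts base36_decode(encoded) == data   # true
```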

Base36 encoding works well in URLs, much like Base64, but it does not suffer from the case-sensitivity issues that Base64 does.

Note that Base36 string encoding should not be confused with base 36 radix integer encoding, which simply converts an integer value to its base 36 representation. The integer technique uses String#to_i(36) and Integer#to_s(36) (Fixnum#to_s(36) before Ruby 2.4).

Michael Gaskill