How to convert the bytes in a number into a string of characters? (character representation of a number)

Question

How do I easily convert a number, e.g. 0x616263, equivalently 6382179 in base 10, into a string by dividing the number up into sequential bytes? So the example above should convert into 'abc'.

I've experimented with Array#pack but can't figure out how to get it to convert more than one byte of the number, e.g. [0x616263].pack("C*") returns 'c'. I've also tried 0x616263.to_s(256), but that throws an ArgumentError: invalid radix. I guess it needs some sort of encoding information?

(Note: Other datatypes in pack like N work with the example I've given above, but only because it fits within 4 bytes, so e.g. [0x616263646566].pack("N") gives cdef, not abcdef)

This question is vaguely similar to this one, but not really. Also, I sort of figured out how to get the hex representation string from a character string using "abcde".unpack("c*").map{|c| c.to_s(16)}.join(""), which gives '6162636465'. I basically want to go backwards.

I don't think this is an X-Y problem, but in case it is - I'm trying to convert a number I've decoded with RSA into a character string.

Thanks for any help. I'm not too experienced with Ruby. I'd also be interested in a Python solution (for fun), but I don't know if it's right to add tags for two separate programming languages to this question.

Where are those numbers coming from? Aren't they just a byte stream, which you could cut every 4 or 8 bytes and interpret as an array of ints? — Eric Duminil, Mar 24 '17 at 21:17
The number is RSA-decoded ciphertext from one of the 'picoCTF' challenges (https://2014.picoctf.com/problems). It's one whole sequence of bytes represented as one number. I'm not sure what you mean by treating it as a byte stream - is that the same as our solutions below, just breaking the whole number up into bytes? — Aralox, Mar 24 '17 at 21:49
Okay, that would explain why the output isn't very standard. — Eric Duminil, Mar 24 '17 at 21:54

score 4 · Answer 1 · edited Mar 24 '17 at 20:55

To convert a single number 0x00616263 into 3 characters, what you really need to do first is separate it into three numbers: 0x00000061, 0x00000062, and 0x00000063.

For the last number, the hex digits you want are already in the correct place. But for the other two, you have to shift the bits right using >> 16 and >> 8 respectively.

Afterwards, use a bitwise AND with 0xFF to mask off the other digits:

num1 = (0x616263 >> 16) & 0xFF
num2 = (0x616263 >> 8) & 0xFF
num3 = 0x616263 & 0xFF

For the characters, you could then do:

char1 = ((0x616263 >> 16) & 0xFF).chr
char2 = ((0x616263 >> 8) & 0xFF).chr
char3 = (0x616263 & 0xFF).chr

Of course, bitwise operations aren't very Ruby-esque. There are probably more Ruby-like answers that someone else might provide.
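
One possibly more Ruby-like variant, sketched here under the assumption of Ruby 2.4 or later: Integer#digits accepts a base, so base 256 yields the bytes directly (least significant first). The method name int_to_bytes is made up for this illustration.

```ruby
# Hypothetical helper, not from the answer above; assumes Ruby >= 2.4.
def int_to_bytes(n)
  # digits(256) returns the base-256 digits, least significant first,
  # so reverse them to get big-endian byte order before packing.
  n.digits(256).reverse.pack('C*')
end

p int_to_bytes(0x616263)        # => "abc"
p int_to_bytes(0x616263646566)  # => "abcdef"
```

Note that Integer#digits raises Math::DomainError for negative numbers, and that 0 comes out as a single zero byte rather than an empty string.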

answered Mar 24 '17 at 07:44 by Nathan · edited by Eric Duminil

Thanks. This is the way I'd do it in other languages, yeah. Probably in some sort of loop that bitshifts both the source number and the mask, maybe log16(number) times to cover all the characters. – Aralox Mar 24 '17 at 08:03
Eric, do you want to elaborate? – Nathan Mar 24 '17 at 17:57
Yup. I am so far off, I don't even know what I was thinking before. – Nathan Mar 24 '17 at 18:18

score 3 · Answer 2 · edited May 23 '17 at 12:18

64-bit integers

If your number is smaller than 2**64 (i.e. it fits in 8 bytes), you can:

convert the "big-endian unsigned long long" to 8 bytes
remove the leading zero bytes

Ruby

[0x616263].pack('Q>').sub(/\A\x00+/, '')
# "abc"
[0x616263646566].pack('Q>').sub(/\A\x00+/, '')
# "abcdef"
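
To see what the pack step produces before the zeros are stripped, the sketch below prints the intermediate bytes. It also illustrates, for the assumed case of an input that contains interior zero bytes, why anchoring the pattern with \A is the safer choice: an unanchored sub removes the first run of zero bytes wherever it occurs.

```ruby
packed = [0x616263].pack('Q>')
p packed.bytes                     # => [0, 0, 0, 0, 0, 97, 98, 99]
p packed.sub(/\A\x00+/, '')        # => "abc"

# A value with no leading zero byte but zeros in the middle.
tricky = [0x6100000000000062].pack('Q>')
p tricky.sub(/\x00+/, '').bytes    # => [97, 98]  (interior zeros lost)
p tricky.sub(/\A\x00+/, '').bytes  # => [97, 0, 0, 0, 0, 0, 0, 98]
```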

Python 2 & 3

In Python 3, struct.pack returns bytes, not a str. Strip the leading zero bytes with lstrip(), then use decode() to convert the bytes to a str:

import struct
# lstrip removes only the leading padding; interior zero bytes are kept
print(struct.pack(">Q", 0x616263646566).lstrip(b"\x00").decode())
# abcdef
print(struct.pack(">Q", 0x616263).lstrip(b"\x00").decode())
# abc

Large numbers

With gsub

If your number doesn't fit in 8 bytes, you could use a modified version of your code. This is shorter and outputs the string correctly if the first byte is smaller than 0x10 (e.g. for "\t"), because it left-pads the hex string to an even number of digits:

def decode(int)
  if int < 2**64
    [int].pack('Q>').sub(/\A\x00+/, '')
  else
    nhex = int.to_s(16)
    nhex = '0' + nhex if nhex.size.odd?
    nhex.gsub(/../) { |hh| hh.to_i(16).chr }
  end
end

puts decode(0x616263) == 'abc'
# true
puts decode(0x616263646566) == 'abcdef'
# true
puts decode(0x0961) == "\ta"
# true
puts decode(0x546869732073656e74656e63652069732077617920746f6f206c6f6e6720666f7220616e20496e743634)
# This sentence is way too long for an Int64

By the way, here's the reverse method:

def encode(str)
  str.reverse.each_byte.with_index.map { |b, i| b * 256**i }.inject(:+)
end
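
A quick round-trip check of the reverse direction. The sketch below repeats the encode method from above and compares it with a hex-based alternative; encode_hex is a hypothetical name, and String#unpack1 assumes Ruby 2.4 or later.

```ruby
def encode(str)
  str.reverse.each_byte.with_index.map { |b, i| b * 256**i }.inject(:+)
end

# Hypothetical alternative: read the bytes as one big hex string.
def encode_hex(str)
  str.unpack1('H*').to_i(16)
end

p encode('abc') == 0x616263        # => true
p encode_hex('abc') == 0x616263    # => true
p encode_hex("\ta") == 0x0961      # => true
```

Note that encode returns nil for an empty string, while the hex variant returns 0. Leading NUL bytes cannot survive a round trip either way, since they do not change the number.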

You should still check whether your RSA code really outputs arbitrarily large numbers or just an array of integers.

With shifts

Here's another way to get the result. It's similar to @Nathan's answer, but it works for integers of any size:

def decode(int)
  a = []
  while int>0
    a << (int & 0xFF)
    int >>= 8
  end
  a.reverse.pack('C*')
end

According to a benchmark with the fruity gem, it's about twice as fast as the gsub solution.
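
The speed claim can be checked roughly without extra gems. The sketch below uses Ruby's standard Benchmark module instead of fruity (a substitution made here), with the two approaches renamed decode_gsub and decode_shift for the comparison. Absolute timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

# The hex/gsub approach, renamed for this comparison.
def decode_gsub(int)
  nhex = int.to_s(16)
  nhex = '0' + nhex if nhex.size.odd?
  nhex.gsub(/../) { |hh| hh.to_i(16).chr }
end

# The shift-and-mask approach, renamed for this comparison.
def decode_shift(int)
  a = []
  while int > 0
    a << (int & 0xFF)
    int >>= 8
  end
  a.reverse.pack('C*')
end

n = 0x546869732073656e74656e6365  # the bytes of "This sentence"
Benchmark.bm(6) do |x|
  x.report('gsub')  { 20_000.times { decode_gsub(n) } }
  x.report('shift') { 20_000.times { decode_shift(n) } }
end
```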

Thanks for your answer, I learned a lot from it. I guess the classic bitwise way is the best even in ruby! I like how you've avoided intermediate string operations in both forward and reverse methods. — Aralox, Mar 24 '17 at 21:48

score 2 · Answer 3 · answered Mar 24 '17 at 07:42

I'm currently rolling with this:

n = 0x616263

nhex = n.to_s(16)
nhexarr = nhex.scan(/.{1,2}/)
nhexarr = nhexarr.map {|e| e.to_i(16)}

out = nhexarr.pack("C*")

But was hoping for a concise/built-in way to do this, so I'll leave this answer unaccepted for now.

answered by Aralox

Note that your method doesn't work for `"\ta"` encoded as `0x0961` – Eric Duminil Mar 24 '17 at 13:08
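
The failure in that comment can be reproduced, and one possible fix is to left-pad the hex string to an even length before splitting it. The method names below are made up for this sketch, and pack('H*') is used as a shorter stand-in for the scan/map steps.

```ruby
# The approach from the answer, wrapped in a method for comparison.
def decode_unpadded(n)
  n.to_s(16).scan(/.{1,2}/).map { |e| e.to_i(16) }.pack('C*')
end

# Hypothetical fix: pad to an even number of hex digits first.
def decode_padded(n)
  nhex = n.to_s(16)
  nhex = "0#{nhex}" if nhex.size.odd?
  [nhex].pack('H*')  # 'H*' reads hex digits, high nibble first
end

p decode_unpadded(0x0961).bytes  # => [150, 1]   ("961" split as "96", "1")
p decode_padded(0x0961)          # => "\ta"
p decode_padded(0x616263)        # => "abc"
```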