
I am writing a program that converts a binary value to a regular string holding its hexadecimal representation. Each byte of the binary value converts to two hexadecimal characters in the string, so the result will be twice the size: one byte of the binary value needs two bytes in the string.

Hexadecimal Characters

0123456789                    ;0x30 - 0x39
ABCDEF                        ;0x41 - 0x46

Example

0xF05C1E3A                    ;hex
4032568890                    ;dec

would become

0x4630354331453341            ;hex
5057600944242766657           ;dec

Question

Are there any elegant/alternative(/interesting) methods for converting between these states, other than a lookup table (bitwise operations, shifts, modulo, etc.)? I'm not looking for a function in a library, but rather how one would/should be implemented. Any ideas?

Display name
  • Putting bitwise operations and modulo at the same level makes me wonder what you are looking for. Anyway, I usually use a 16-byte LUT with the `"0123456789ABCDEF"` string, per each nibble (4 bits). It's a bit longer (binary size, including the LUT) than a pure-code solution, but simpler to write in a hurry. //// edit: and you are converting a [binary] value into a hexadecimal string... there's nothing "hexa" about your source value, except you picked hexa formatting for your example. (formatting is not part of the value). – Ped7g Aug 09 '17 at 18:59
  • I can see how you go from hex `0xF05C1E3A` to `0x4630354331453341` which doubles the length 8 to 16 (without the `0x`) , but not how you go from dec `4032568890` to `5057600944242766657` which goes from length 10 to length 19. – Weather Vane Aug 09 '17 at 19:02
  • @WeatherVane it's just the same value shown in both formattings. – Ped7g Aug 09 '17 at 19:04
  • @Ped7g It's not encryption if that's what you mean. I'm not sure what you mean by LUT? Yes, you're right, I'll edit that. – Display name Aug 09 '17 at 19:07
  • "Look Up Table", as shown in the first comment from @Ped7g where you index an ASCII array by a 4-bit nibble value. – Weather Vane Aug 09 '17 at 19:08
  • @WeatherVane I like that – Display name Aug 09 '17 at 19:09
  • Note this is not hex to ... this is binary to .... I assume you mean the decimal values are in ASCII, as in a string; you are not being very clear as to what your input and output are. Now if you really have a string of hex values, 0x30, 0x31, etc, then showing us 0xABCD is misleading: do you have the ASCII string 0x30 then 'x' then 0x41, 0x42 etc? Or is it the binary value 0xABCDEF? If the former, then you need to convert from an ASCII string of hex values to binary, then convert from binary to an ASCII string representing the number in decimal. – old_timer Aug 09 '17 at 20:22
  • I assume you know how to convert 12345 seconds into 3:25:45? It is exactly the same as that, except instead of base 10 to base 60 you are going from base 2 to base 10, or perhaps, since your question is vague, from base 16 to base 2 (trivial), then from base 2 to base 10 (like the 10-to-60 conversion above). The only time there is a trick is if one base you are converting to/from is a power of the other base (hex to binary, octal to binary). – old_timer Aug 09 '17 at 20:26
  • On top of this you may also have a from-ASCII and, it sounds like, a to-ASCII conversion mixed in here as well; do those first and last as needed. Decimal to/from ASCII is trivial; hex to/from ASCII is mostly trivial, either with a conditional in there or with a lookup table. – old_timer Aug 09 '17 at 20:28
  • @old_timer I have edited the question to clear up confusion. I'm leaving the hexadecimal values in the question because they were helpful while writing it - I was using it as a LUT. – Display name Aug 09 '17 at 20:46
  • " This means the result will be twice the size; " --> close. A _string_ has a _null character_, so the size of the string will be 2*N + 1. – chux - Reinstate Monica Aug 09 '17 at 21:34
  • @chux it's not a string, it's an integer that contains a character code in every 8 bits. – Mark Ransom Aug 09 '17 at 23:11
  • @MarkRansom Perhaps OP does not want what was asked as a _string_ was requested, a "regular string". – chux - Reinstate Monica Aug 10 '17 at 02:14
  • @chux you could be right. But as asked, it certainly *looks* like they're expecting a single number, especially when inquiring about bit manipulation methods. – Mark Ransom Aug 10 '17 at 02:17

7 Answers


Here's a solution with nothing but shifts, and/or, and add/subtract. No loops either.

uint64_t x, m;
x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8)  | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4)  | (x & 0x000f000f000f000fLL);
x += 0x0606060606060606LL;
m = ((x & 0x1010101010101010LL) >> 4) + 0x7f7f7f7f7f7f7f7fLL;
x += (m & 0x2a2a2a2a2a2a2a2aLL) | (~m & 0x3131313131313131LL);

Above is the simplified version I came up with after a little time to reflect. Below is the original answer.

uint64_t x, m;
x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8) | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4) | (x & 0x000f000f000f000fLL);
x += 0x3636363636363636LL;
m = (x & 0x4040404040404040LL) >> 6;
x += m;
m = m ^ 0x0101010101010101LL;
x -= (m << 2) | (m << 1);

See it in action: http://ideone.com/nMhJ2q

Mark Ransom

Spreading out the nibbles to bytes is easy with pdep:

spread = _pdep_u64(raw, 0x0F0F0F0F0F0F0F0F);

Now we'd have to add 0x30 to bytes in the range 0-9 and 0x37 to higher bytes (so 10 maps to 'A'). This could be done by SWAR-subtracting 10 from every byte and then using the sign to select which number to add, such as (not tested)

uint64_t H = 0x8080808080808080;
uint64_t ten = 0x0A0A0A0A0A0A0A0A;
uint64_t cmp = ((spread | H) - (ten & ~H)) ^ ((spread ^ ~ten) & H); // SWAR subtract
uint64_t masks = ((cmp & H) >> 7) * 255;
// if x-10 is negative, take 0x30, else 0x37
uint64_t add = (masks & 0x3030303030303030) | (~masks & 0x3737373737373737);
uint64_t asString = spread + add;

That SWAR compare can probably be optimized since you shouldn't need a full subtract to implement it.

There are some different suggestions here, including SIMD: http://0x80.pl/articles/convert-to-hex.html

harold
  • I wasn't aware of PDEP so I had to [look it up](https://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets#Parallel_bit_deposit_and_extract). Very interesting, but not universally available. – Mark Ransom Aug 09 '17 at 19:18
  • @MarkRansom not universally fast either, it's pretty slow on Ryzen. But I thought it would be interesting. – harold Aug 09 '17 at 19:19
  • I figured out a way to spread the bytes without PDEP, have a look at my answer. – Mark Ransom Aug 10 '17 at 01:57
  • You have a bug in this code - you want to add 0x41 but you also need to subtract 10 at the same time, so 10->'A' rather than 0->'A'. I considered fixing it for you but I'll let you do the honors. – Mark Ransom Aug 11 '17 at 22:37
  • @MarkRansom right thanks, I've changed that constant – harold Aug 11 '17 at 22:39
  • [How to convert a binary integer number to a hex string?](https://stackoverflow.com/q/53823756) has some x86 asm SIMD versions, SSE2 through AVX512VBMI. (Your SIMD link is dead, perhaps http://0x80.pl/articles/convert-to-hex.html? ). There's also an implementation in an old version of Agner Fog's VCL using conditional add instead of a `pshufb` LUT: https://github.com/darealshinji/vectorclass/blob/master/special/decimal.h#L828 – Peter Cordes Jan 03 '21 at 11:32

A slightly simpler version based on Mark Ransom's:

uint64_t x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8)  | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4)  | (x & 0x000f000f000f000fLL);
x =  (x + 0x3030303030303030LL) +
   (((x + 0x0606060606060606LL) & 0x1010101010101010LL) >> 4) * 7;

And if you want to avoid the multiplication:

uint64_t m, x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8)  | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4)  | (x & 0x000f000f000f000fLL);
m =  (x + 0x0606060606060606LL) & 0x1010101010101010LL;
x =  (x + 0x3030303030303030LL) + (m >> 1) - (m >> 4);
chqrlie

A LUT (lookup table) C++ variant. I didn't check the actual machine code produced, but I believe any modern C++ compiler will catch the idea and compile it well.

static const char nibble2hexChar[] { "0123456789ABCDEF" };
     // 17 bytes in total (the literal includes a NUL), because I'm too lazy to init it per char

void byteToHex(std::ostream & out, const uint8_t value) {
    out << nibble2hexChar[value>>4] << nibble2hexChar[value&0xF];
}

// this one is actually written more toward short+simple source, than performance
void dwordToHex(std::ostream & out, uint32_t value) {
    int i = 8;
    while (i--) {
        out << nibble2hexChar[value>>28];
        value <<= 4;
    }
}

EDIT: For C code you just have to switch from std::ostream to some other output means; unfortunately your question lacks any details about what you are actually trying to achieve and why you don't use the built-in printf family of C functions.

For example, C like this can write to some char* output buffer, converting an arbitrary amount of bytes:

/**
 * Writes hexadecimally formatted "n" bytes array "values" into "outputBuffer".
 * Make sure there's enough space in output buffer allocated, and add zero
 * terminator yourself, if you plan to use it as C-string.
 * 
 * @Returns: pointer after the last character written.
 */
char* dataToHex(char* outputBuffer, const size_t n, const unsigned char* values) {
    for (size_t i = 0; i < n; ++i) {
        *outputBuffer++ = nibble2hexChar[values[i]>>4];
        *outputBuffer++ = nibble2hexChar[values[i]&0xF];
    }
    return outputBuffer;
}

And finally, I once helped somebody on Code Review who had a performance bottleneck exactly in hexadecimal formatting; there I did the conversion without a LUT. The whole process and the other answer with performance measurements may be instructive for you, as you can see that the fastest solution doesn't just blindly convert the result, but actually mixes the conversion into the main operation to achieve better performance overall. That's why I wonder what you are trying to solve, as the whole problem may often allow a more optimal solution. If you just ask about conversion, printf("%x", ...) is a safe bet.

Here is that another approach for "to hex" conversion: fast C++ XOR Function

Ped7g
  • come on! It is not C as tagged – 0___________ Aug 09 '17 at 20:28
  • You're right it is much easier to code. I like this! – Display name Aug 09 '17 at 20:40
  • @PeterJ ups, sorry, I'm usually checking just assembly questions on SO, and then whether the C or C++ is added doesn't matter that much for me. To OP: but in C you have this beautiful thing like clib, which has for example `*printf("%08x", ...)`. Unless you really have measurable (!) performance bottleneck in formatting, it's always safer to use already well tested solution. Also if you have problem to convert this C++ into C, let me know, but then the question arises, how you want the output be done (C++ streams are reasonable general interface how to write output idea simply). – Ped7g Aug 09 '17 at 21:16
  • @Ped7g I use both so you don't have to proof that C++ is a very decent language :). I am not taking the part in this religious war. OP wanted C so probably he does not know C++. – 0___________ Aug 09 '17 at 21:19
  • Either is fine, I can understand your answer. – Display name Aug 09 '17 at 21:38
  • @PeterJ added also C example (the algorithm is actually without any change in both languages, it's just about output), and link to third variant, how to do it without LUT just in code, plus it's a real world example how overall tailor-fit architecture can speed up code several times. – Ped7g Aug 09 '17 at 21:42
  • @Ped7g I have added my one as well (more universal: any base < 255 and negative numbers as well) – 0___________ Aug 09 '17 at 21:55
  • @Ped7g I think you have made a mistake :). Your string has the reverse order of digits. And actually it is UB as you forgot the trailing zero as well – 0___________ Aug 09 '17 at 22:00
  • @PeterJ no, and no. The C variant is converting array of bytes, not `dwords`. And it doesn't add zero terminator on purpose, in case it's just part of output, or written into file. As documented in the comment about the function. I have seen your universal converter, looks quite good. For binary -> hex it's a bit slow, but the any-base feature makes that worth in some cases. ... EDIT: thinking more about the wording of question, the trailing zero would be maybe closer to OP intention. It's difficult to work for me without accurate specs, as I tend to abuse every tiny detail ;) – Ped7g Aug 09 '17 at 22:02

A bit more decent conversion from the integer to a string, in any base from 2 up to the length of the digits table

char *reverse(char *);

const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char *convert(long long number, char *buff, int base)
{
    char *result = (buff == NULL || base > strlen(digits) || base < 2) ? NULL : buff;
    char sign = 0;

    if (number < 0)
    {
         sign = '-';
        number = -number;
    }
    if (result != NULL)
    {
        do
        {
            *buff++ = digits[number % base];
            number /= base;
        } while (number);
        if(sign) *buff++ = sign;
        *buff = 0;
        reverse(result);
    }
    return result;
}


char *reverse(char *str)
{
    char tmp;
    int len;

    if (str != NULL)
    {
        len = strlen(str);
        for (int i = 0; i < len / 2; i++)
        {
            tmp = *(str + i);
            *(str + i) = *(str + len - i - 1);
            *(str + len - i - 1) = tmp;

        }
    }
    return str;
}

example - counting from -50 to 50 decimal in base 23

-24     -23     -22     -21     -20     -1M     -1L     -1K     -1J     -1I     -1H     -1G     -1F     -1E     -1D
-1C     -1B     -1A     -19     -18     -17     -16     -15     -14     -13     -12     -11     -10     -M      -L
-K      -J      -I      -H      -G      -F      -E      -D      -C      -B      -A      -9      -8      -7      -6
-5      -4      -3      -2      -1      0       1       2       3       4       5       6       7       8       9
A       B       C       D       E       F       G       H       I       J       K       L       M       10      11
12      13      14      15      16      17      18      19      1A      1B      1C      1D      1E      1F      1G
1H      1I      1J      1K      1L      1M      20      21      22      23      24
0___________
  • You don't special-case base10 or base16, so the compiler will use an actual `div` instruction per digit. That's going to really suck compared to [what it can do for a compile-time-constant division by 10](https://stackoverflow.com/questions/41183935/why-does-gcc-use-multiplication-by-a-strange-number-in-implementing-integer-divi). Or especially for div / mod by 16 or 8, which is just a shift or mask. – Peter Cordes Aug 10 '17 at 01:08
  • Also, it would be faster to decrement a pointer starting from the end of a `char tmp[sizeof(long long)*CHAR_BIT]` local buffer, and then memcpy from that to the right position in `buff` (the start, or after a leading `-`). Then you don't have to reverse. You also don't have to `strlen()` when you already have that information. – Peter Cordes Aug 10 '17 at 01:55
  • This is broken for `LLONG_MIN` on 2's complement machines, because `-LLONG_MIN` is still negative. – Peter Cordes Aug 10 '17 at 01:58
  • @Peter Cordes 1. It is a universal one for the forum. One can make it as optimal as he wants by adding special cases or changing the number type. 2. No. As you do not know the converted string length and you do not know the buff size. 3. Yes it is. But it works in the rest 18446744073709551615 cases – 0___________ Aug 10 '17 at 07:22
  • `sizeof(long long) * CHAR_BIT` is always enough space for just the digits. There's a very low upper-bound on that, usually 64, so creating an automatic storage array of that size should always be fine. You already check that `base>=2`, so you know for certain that there are at most `sizeof(long long) * CHAR_BIT` digits (any base higher than 2 will take fewer digits). This is why I suggested a tmp buffer with automatic storage, rather than `memmove` inside inside the caller's destination, because *that*'s the only thing you can't do. – Peter Cordes Aug 10 '17 at 07:33
  • re: point 1: the question title says "efficient". You're giving up *massive* amounts of efficiency by generalizing this for any base. – Peter Cordes Aug 10 '17 at 07:35
  • It does not make too much sense as buff may be shorter. This is the **correct** way of doing it. – 0___________ Aug 10 '17 at 07:35
  • Of course buff may be shorter. That's why you only copy as many non-zero digits as you actually produce, not the whole `tmp`. `tmp[]` just has to be big enough for the worst-case, to avoid making it variable-sized. – Peter Cordes Aug 10 '17 at 07:37
  • Anyway, this is a potentially-useful answer for someone, but it doesn't belong on this question. I'm not going to downvote it, since it's at least mostly correct, just answering a different question. – Peter Cordes Aug 10 '17 at 07:40
  • @Peter Cordes you should point it to the standard library authors as well. It is an incorrect approach in the general case. – 0___________ Aug 10 '17 at 07:42
  • efficiency is quite a broad term - efficient in speed, memory, size? I took a look at how the standard library functions look - same as mine. report to https://gcc.gnu.org/bugzilla/ – 0___________ Aug 10 '17 at 07:51
  • The OP here is clearly looking for execution speed, given that it's tagged assembly. If the standard library implementations are slow, that's probably why this question got asked in the first place. Which C library function did you look at? Part of sprintf? Can you link the source you were looking at? (I am actually curious about this part, not just arguing with you). The variable-base functions like `strtol` all for converting in the other direction, where you multiply by the base. – Peter Cordes Aug 10 '17 at 07:57
  • depending on your compiler and platform google for example "arm gcc ltoa source code", download the latest build of the toolchain and see the sources- you have a lots of options. I intentionally do not give you a particular one - as you may say that I have chosen one to proof my point – 0___________ Aug 10 '17 at 08:03
  • On modern platforms idiv or sdiv instructions are very fast and the overhead compared to and-ing and shifting is marginal. If you are on a 32 bit platform just change the `long long` to `long` and on many platforms you will get the same execution time as with shifts. – 0___________ Aug 10 '17 at 08:28
  • glibc internally uses an [`_itoa_word` function](https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-common/_itoa.c;hb=68dc02d1dcbfb37ee22327d6a3c43f528d593035#l162) which takes a variable base, but it does both of the optimizations I suggested: it uses `switch(base)` to special-case bases 8, 10, and 16, and it counts backward from the end of the caller's buffer (which is what the caller passes). `objdump -drwC -Mintel /lib/libc.so.6 | less` on my desktop shows asm that looks exactly like that function, and there are calls to it from other glibc functions. – Peter Cordes Aug 10 '17 at 08:36
  • `idiv r64` is one of the slowest x86 integer instructions. See https://stackoverflow.com/a/40355466/224132, where I went into more detail about performance. The latency is 32-96 cycles on Intel Haswell, and throughput almost as bad. A shift or AND each have 1 cycle latency, and can run two or four per clock cycle, respectively. ARM may be different. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0460d/Cfahhdji.html says the latency is high for small divisors, like x86. Are you sure your test didn't inline into a caller with a compile-time-constant base? – Peter Cordes Aug 10 '17 at 08:44
  1. Decimal -> Hex

Just iterate through the string and convert every character to an int, then you can do

printf("%02x", c);

or use sprintf to save to another variable

  2. Hex -> Decimal

Code

printf("%c",16 * hexToInt('F') + hexToInt('0'));


int hexToInt(char c)
{
    if(c >= 'a' && c <= 'z')
        c = c - ('a' - 'A');

    int sum;

    sum = c / 16 - 3;
    sum *= 10;
    sum += c % 16;

    return (sum > 9) ? sum - 1 : sum;
}
kocica

The articles below compare different methods of converting digits to strings; hex numbers are not covered, but it seems not a big problem to switch from dec to hex

Integers

Fixed and floating point

@EDIT Thank you for pointing out that the answer above is not relevant. A common way with no LUT is to split the integer into nibbles and map them to ASCII:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define HI_NIBBLE(b) (((b) >> 4) & 0x0F)
#define LO_NIBBLE(b) ((b) & 0x0F)

void int64_to_char(char carr[], int64_t val){
    memcpy(carr, &val, 8);
}

uint64_t inp = 0xF05C1E3A;
char tmp_st[8];

int main()
{
    int64_to_char(tmp_st,inp);
    printf("Sample: %llx\n", (unsigned long long)inp);
    printf("Result: 0x");
    for (unsigned int k = 8; k; k--){
        char tmp_ch = *(tmp_st+k-1);
        char hi_nib = HI_NIBBLE(tmp_ch);
        char lo_nib = LO_NIBBLE(tmp_ch);
        if (hi_nib || lo_nib){
            printf("%c%c",hi_nib+((hi_nib>9)?55:48),lo_nib+((lo_nib>9)?55:48));
        }
     }
     printf("\n");
    return 0;
}

Another way is to use Allison's Algorithm (both variants convert a nibble value in `AL` to its ASCII hex digit using the x86 decimal-adjust instructions). I am a total noob in ASM, so I post the code in the form I googled it.

Variant 1:

ADD AL,90h
DAA
ADC AL,40h
DAA

Variant 2:

CMP  AL, 0Ah
SBB  AL, 69h
DAS
Gryphon
  • Links do no make a good answer, because they are not *your* answer, and because they may vanish. More of a comment. – Weather Vane Aug 09 '17 at 19:30
  • And converting a binary value into a hexadecimal string through ordinary math arithmetic is horrible, as each 4-bit group directly maps to a single hexadecimal digit, so there's no math involved at all. I.e. conversion to hexa is nowhere similar to conversion to decimal, which is a much tougher problem for computers. Hexa is just "remapping" bits without any [serious] calculation. – Ped7g Aug 09 '17 at 19:32