
I'm trying to convert a hexadecimal string to binary, which will later be passed to an encryption library (OpenSSL AES EVP_xx), so I guess I just need a binary string with a nul terminator.

I've used the 2nd test demonstrated by Josch. His test case is pretty powerful, so I ended up using the lookup table. I just changed the input variables and added printing, but I always get garbage printed.

I've only added the following:

  • malloc (expected binary string length + 1)

  • set the nul terminator at the end of the loop. AFAIK these garbage characters are returned when a string is not nul-terminated.

      #include <string.h>
      #include <stdio.h>
      #include <stdlib.h> 
    
      /* the resulting binary string is half the size of the input hex string
      * because every two hex characters map to one byte */
    
      unsigned char base16_decoding_table1[256] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,3,4,5,6,7,8,9,0,0,0,0,0,0,0,10,11,12,13,14,15,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,11,12,13,14,15,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,};
    
      int main(int argc,  char **argv){
    
      int TESTDATALEN=strlen(( char*) argv[1]);
    
      //char *result=malloc((TESTDATALEN/2)+1);
      int mallocLen=(TESTDATALEN%2) ? ((TESTDATALEN+3)/2) : ((TESTDATALEN/2)+1);
      unsigned char *result=malloc(mallocLen);
      *result='\0';
    
      int i;
      unsigned char cur;
      unsigned char val;
    
      for (i = 0; i < TESTDATALEN; i++) {
          cur = argv[1][i];
          val = base16_decoding_table1[(int)cur];
    
      /* even characters are the first half, odd characters the second half
              * of the current output byte */
          if (i%2 == 0) {
                  result[i/2] = val << 4;
              } else {
                  result[i/2] |= val;
              }
          }
      result[mallocLen - 1] = '\0';   /* last valid index; result[mallocLen] is out of bounds */
    
      printf ("Binary value: %s\n", result);
      //for (i = 0; i < TESTDATALEN/2; i++) printf ("%02hhx", result[i]); 
      //putchar ('\n');
    
      free(result);
    
      return 0;
    
      }
    

I might add more context if it's relevant: I can't touch strcat for performance reasons, since the hexadecimal-to-binary conversion is just one step in the encryption, which is performed on thousands (or millions) of inputs. A single field can reach 8 KB, so I can't use strtol either and have to do it in strings.

Current output is:

hex2bin1.6.2 abc123def789

��#�

While Binary equivalent for this value is: 101010111100000100100011110111101111011110001001

The target is to feed the binary string returned from this conversion code to the OpenSSL AES EVP encryption routines, which accept and return binary input.

Thanks in advance

Kineticx
  • Hmmm, your code looks [working well](https://wandbox.org/permlink/lMXtwEjUJSCpuOIW). – MikeCAT Dec 06 '20 at 09:17
  • There's at least two fundamental errors here. One is that your output buffer is too small (it's off by one when the input string has odd length). Second is that `char` may be signed, so `base_decoding_table[(int)cur]` may be out of range. On a more stylistic note, you do a lot of aimless casts. Your question would also be better if you provided a hard-coded example string which fails, and what output you would expect. – Paul Hankin Dec 06 '20 at 09:17
  • You `#include "base_encoding_table.h"` but that's not provided in the question. That could also be a source of problems. – Paul Hankin Dec 06 '20 at 09:21
  • Why `printf ("Binary value: %s\n", result);` instead of `for (i = 0; i < TESTDATALEN/2; i++) printf ("%02hhx", result[i]); putchar ('\n');`? `result` may be type `char` and nul-terminated -- but it certainly does not hold ASCII character values... – David C. Rankin Dec 06 '20 at 09:56
  • @MikeCAT thanks for testing it. The output is still symbols; I need 0's & 1's. I've added a better explanation at the end – Kineticx Dec 07 '20 at 04:54
  • @PaulHankin Thanks for your comments: 1: malloc has been corrected to account for odd & even input; 2: all char switched to unsigned; 3: it was just an empty file with the fn content, NVM, I've removed it – Kineticx Dec 07 '20 at 04:55
  • @DavidC.Rankin Thanks for your feedback. Maybe I have a total misunderstanding! I need a string of 0's & 1's which will be accessed by the encryption fn. Using your printf returned the exact same input – Kineticx Dec 07 '20 at 04:58
  • Yes, `printf` with `"%s"` expects a nul-terminated string of ASCII characters, see [ASCII Table & Description](http://www.asciitable.com/). You are doing a valid conversion from hex -- but you do not have a *string* when done. Understand when the conversion is complete -- you have the binary. Everything your computer does is stored in binary. If you want to see the binary, you just have to output the representation of the bits in memory. Using `printf` with `"%d"` just gives a signed-integer view of the binary bits in memory, `"%x"` a hexadecimal view of the same binary bits. – David C. Rankin Dec 07 '20 at 05:02
  • [code run result](https://wandbox.org/permlink/3lM1M0I757KOTcaD) – Kineticx Dec 07 '20 at 05:02

1 Answer

Alright, let's get you straightened out. The biggest misconception is that the contents of result will be a string printable with puts() or printf() using the "%s" conversion specifier. The values in result are not ASCII character codes; they are raw bytes assembled from the integer values looked up in base16_decoding_table1[]. That table does not contain ASCII character values either. If it did, every element in the initializer would be surrounded with single quotes, e.g. '0','0','0',... and the entries for indexes 65-70 would be 'A','B','C','D','E','F',... See the ASCII Table & Description (http://www.asciitable.com/).

As mentioned in the comment, all numbers are stored in memory in binary. When you look at an integer value as a decimal value, an octal value or a hex value, those are all just different views of the same binary bits in memory. If you want to look at the actual binary bits, you just need to output a representation of the bits directly from memory.

There are a number of ways to do it, but a simple unpadded binary output of '0's and '1's corresponding to the binary 0s and 1s in memory can be done with:

/** unpadded binary representation of 'v'. */
void binprn (const unsigned long v)
{
    if (!v)  { putchar ('0'); return; }

    size_t sz = sizeof v * CHAR_BIT;
    unsigned long rem = 0;

    while (sz--)
        if ((rem = v >> sz))
            putchar ((rem & 1) ? '1' : '0');
}

Now you simply need to take the hex input as the program argument, convert it to a stored value, and then output the binary representation of that number. Since you are dealing with unsigned numbers, the strtoul() function is a simple way to convert hex character input to an unsigned value stored in memory. Keep in mind that strtoul() can only convert as many hex digits as fit in an unsigned long, so it is used here just to display the value; the byte-by-byte conversion into result has no such limit. Both are shown in your updated code below:

(Edit: result is now sized correctly when TESTDATALEN is odd or 1. Note that result holds two hex characters per byte, so for an odd TESTDATALEN the last output byte is zero-padded.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>      /* for errno for strtoul validation */
#include <limits.h>     /* for CHAR_BIT */

/* the resulting binary string is half the size of the input hex string
 * because every two hex characters map to one byte */

unsigned char base16_decoding_table1[256] = {
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,1,2,3,4,5,6,7,8,9,0,0,0,0,0,0,0,10,11,12,13,14,15,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,11,12,13,14,15,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 };

/** unpadded binary representation of 'v'. */
void binprn (const unsigned long v)
{
    if (!v)  { putchar ('0'); return; }

    size_t sz = sizeof v * CHAR_BIT;
    unsigned long rem = 0;

    while (sz--)
        if ((rem = v >> sz))
            putchar ((rem & 1) ? '1' : '0');
}

int main(int argc, char **argv){
    
    if (argc < 2)       /* validate 1 argument given */
        return 1;
    
    size_t  TESTDATALEN = strlen(argv[1]),
            resultsz = TESTDATALEN / 2 + TESTDATALEN % 2; /* must have 1 char */
    
    unsigned char *result = calloc (resultsz, 1);   /* initialize result all 0 */

    if (!result) {      /* validate allocation succeeded */
        perror ("calloc-result");
        return 1;
    }
    
    errno = 0;          /* zero errno before call to strtoul */
    
    char *endptr;       /* end-pointer for strtoul validation */
    size_t i;
    int cur;
    unsigned char val;
    unsigned long value = strtoul (argv[1], &endptr, 16);   /* convert input */
    
    if (endptr == argv[1]) {    /* check if no digits converted */
        fputs ("error: invalid hex format - no digits converted.\n", stderr);
        return 1;
    }
    else if (errno) {           /* check for under/overflow */
        fputs ("error: overflow in conversion to hex.\n", stderr);
        return 1;
    }

    for (i = 0; i < TESTDATALEN; i++) {
        cur = (unsigned char)argv[1][i];    /* avoid negative index if char is signed */

        val = base16_decoding_table1[cur];
    
        /* even characters are the first half, odd characters the second half
         * of the current output byte */
        if (i % 2 == 0)
            result[i/2] = val << 4;
        else
            result[i/2] |= val;
    }
    
    printf ("hex value: %lx\nresult   : ", value);  /* output stored value */
    
    for (i = 0; i < resultsz; i++)                  /* output bytes in result */
        printf ("%02hhx", result[i]);
    putchar ('\n');
    
    fputs ("binary   : ", stdout);                  /* output binary of value */
    binprn (value);
    putchar ('\n');
    
    free(result);
}

Example Use/Output

Now when you provide a character representation of a valid hexadecimal number, you get the hex representation of the stored value, followed by the bytes stored in result, followed by the binary representation of value (the same bits that result holds in byte form), e.g.

$ ./bin/base16decode aaff
hex value: aaff
result   : aaff
binary   : 1010101011111111

Look things over and let me know if you have further questions.

David C. Rankin