1

I have a char array, say char value[] = {'0','2','0','c','0','3'};

I want to convert this into a byte array like unsigned char val[] = {'02','0c','03'}

This is in an embedded application, so I can't use string.h functions. How can I do this?

luser droog
ganeshredcobra
  • First of all, if you want the characters to be combined into byte-values, you can't have the destination with multi-character literals. – Some programmer dude Jan 29 '14 at 06:40
  • Did you mean `unsigned short val[]={0x02,0x0c,0x03}`? – Marian Jan 29 '14 at 06:44
  • @Marian Yes, I want exactly that, but unsigned char can also hold 0x02 etc., right? – ganeshredcobra Jan 29 '14 at 06:46
  • @ganeshredcobra: `char`, `signed char` and `unsigned char` can all hold _0x02_, since it's just 2. – legends2k Jan 29 '14 at 07:29
  • @ganeshredcobra So to be clear, you receive [nibbles](http://en.wikipedia.org/wiki/Nibble) as hexadecimal digit characters, and you want to combine each pair into one byte? How do you know the length? Is it always 6 chars == 3 bytes? – hyde Jan 29 '14 at 07:43
  • I think you're confusing people with the data type of your destination array. Most answers assume that you always want to put two numbers together AS A STRING, since your destination type is also `char`. I think you should make clear whether you want the final data in character or in integer form. – Toby Jan 29 '14 at 07:49
  • If this turns out to *not* be each character representing a nibble in a byte (therefore 2 per output byte) i'll be surprised. – WhozCraig Jan 29 '14 at 08:20
  • @hyde from the comments below, length is variable but up to 8 or 10, presumably always even. – luser droog Jan 29 '14 at 08:34
  • This is a simple hex to binary conversion. You should be able to code one in your sleep. – Hot Licks Jan 29 '14 at 21:38

4 Answers

4

Since you talk about an embedded application, I assume that you want to save the numbers as values and not as strings/characters. So if you just want to store your character data as numbers (for example in an integer), you can use sscanf.

This means you could do something like this:

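 char source_val[] = "0A03B7";   // Represents the numbers 0x0A, 0x03 and 0xB7; a string literal, so sscanf sees a '\0' terminator
 uint8 dest_val[3];              // We want to save 3 numbers (uint8: the platform's 8-bit unsigned type)
 unsigned int tmp;               // %x expects a pointer to unsigned int
 for(int i = 0; i<3; i++)
 {
     sscanf(&source_val[i*2],"%2x",&tmp); // Every time we read at most two hex digits --> %2x
     dest_val[i] = (uint8)tmp;
 }
 // Now dest_val contains 0x0A, 0x03 and 0xB7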

However, if you want to store it as a string (like in your example), you can't use unsigned char, since this type is only 8 bits wide and can therefore hold only one character. Storing 'B3' in a single (unsigned) char does not work.

Edit: OK, according to the comments, the goal is to save the passed data as a numerical value. Unfortunately the asker's compiler does not support sscanf, which would be the easiest way to do so. Since this is (in my opinion) the simplest approach, I will leave this part of the answer as it is and add a more custom approach in this edit.

Regarding the data type, it actually doesn't matter whether you have uint8. Even though I would advise using some kind of integer data type, you can also store your data in an unsigned char. The problem here is that the data you are passed is a character/letter that you want to interpret as a numerical value. However, the internal storage of the character differs from that value. You can look up the internal value of every character in the ASCII table. For example:

char letter = 'A'; // Internally 0x41 
char number = 0x61; // Internally 0x61 - represents the letter 'a'

As you can see, there is also a difference between upper and lower case.

If you do something like this:

int myVal = letter;  // myVal is now 0x41 (decimal 65)

myVal won't represent the value 0xA (decimal 10), it will have the value 0x41.

The fact that you can't use sscanf means you need a custom function. So first of all we need a way to convert one letter into an integer:

int charToInt(char letter)
{
    int myNumerical;
    // First we want to check if it's 0-9, A-F, or a-f --> See ASCII Table
    if(letter > 47 && letter < 58)
    {
        // 0-9
        myNumerical = letter-48;
        // The Letter "0" is in the ASCII table at position 48 -> meaning if we subtract 48 we get 0 and so on...
    }
    else if(letter > 64 && letter < 71)
    {
       // A-F
       myNumerical = letter-55;
       // The Letter "A" (dec 10) is at Pos 65 --> 65-55 = 10 and so on..
    }
    else if(letter > 96 && letter < 103)
    {
       // a-f
       myNumerical = letter-87;
       // The Letter "a" (dec 10) is at Pos 97--> 97-87 = 10 and so on...
    }
    else
    {
       // Not supported letter...
       myNumerical = -1;
    }
    return myNumerical;
}

Now we have a way to convert every single character into a number. The other problem is to combine each pair of characters into one value, but this is rather easy:

int appendNumbers(int higherNibble, int lowerNibble)
{
     int myNumber = higherNibble << 4;
     myNumber |= lowerNibble;
     return myNumber;
    // Example: higherNibble = 0x0A, lowerNibble = 0x03;  -> myNumber = 0xA3
    // Of course you have to ensure that the parameters are not bigger than 0x0F 
}

Now everything together would be something like this:

 char source_val[] = {'0','A','0','3','B','7'}; // Represents the numbers 0x0A, 0x03 and 0xB7
 int dest_val[3];                             // We want to save 3 numbers
 int temp_low, temp_high;
 for(int i = 0; i<3; i++)
 {
     temp_high = charToInt(source_val[i*2]);
     temp_low = charToInt(source_val[i*2+1]);
     dest_val[i] = appendNumbers(temp_high , temp_low);
 }
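
The pieces above can also be folded into one routine that takes the length as a parameter, since the comments mention that the input length varies. This is only a sketch under assumptions: it assumes an ASCII character set, and the names hex_digit_value and hex_to_bytes are made up for illustration. It writes into an unsigned char buffer supplied by the caller and rejects odd lengths and invalid digits instead of storing -1 based garbage.

```c
#include <stddef.h>  /* size_t only; no library functions are called */

/* Hypothetical helper: value of one hex digit, or -1 if invalid (ASCII assumed). */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Converts n_chars hex digits from src into n_chars / 2 bytes in dst.
   Returns the number of bytes written, or -1 on odd length or invalid digit. */
int hex_to_bytes(const char *src, size_t n_chars, unsigned char *dst)
{
    size_t i;

    if (n_chars % 2 != 0)
        return -1;

    for (i = 0; i < n_chars; i += 2) {
        int high = hex_digit_value(src[i]);
        int low  = hex_digit_value(src[i + 1]);

        if (high < 0 || low < 0)
            return -1;

        /* High nibble goes into bits 7..4, low nibble into bits 3..0. */
        dst[i / 2] = (unsigned char)((high << 4) | low);
    }
    return (int)(n_chars / 2);
}
```

With the source_val array from above, a call such as hex_to_bytes(source_val, sizeof source_val, dest) would fill a three-element unsigned char dest[] with 0x0A, 0x03 and 0xB7.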

I hope that I understood your problem correctly and that this helps.

wrapperapps
Toby
  • sscanf is not available in the compiler, and neither is uint8 – ganeshredcobra Jan 29 '14 at 08:43
  • Ok...could you please clarify if you want to save the data you receive as a literal (for example `string myNumber = "0xAB"`) - or do you want to save it numerically (for example `int myNumber = 0xAB`)? Because there is a huge difference. The string would save 4 letters 0,x,A,B whereas the integer (or any other numerical type) would save it as AB -> 171 decimal – Toby Jan 29 '14 at 09:01
  • I want it as int myNumber = 0xAB – ganeshredcobra Jan 29 '14 at 09:14
  • Ok, I guess I understood your problem now (at least I hope so)...just check the edit and let me know if it helped – Toby Jan 29 '14 at 09:44
  • Yes, it helped a lot, but the code has issues when the values are FF or AA, AB etc.; it's not always giving correct values – ganeshredcobra Jan 29 '14 at 10:28
  • Gonna look into it after lunch – Toby Jan 29 '14 at 11:18
  • What is the method for the reverse conversion: given an array of hex values, convert it to char or ASCII values? – ganeshredcobra Feb 03 '14 at 11:34
  • The reverse action should be done the same way around. You have your integer, then you split it up into two numericals (0xA3 in A/10 and 3). Then you convert them to their equivalent char (see ASCII table): – Toby Feb 03 '14 at 11:41
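
The reverse direction described in the last comment (split each byte into two nibbles, then map each nibble to its character) could look roughly like the following. It is a sketch, not code from the thread: it assumes ASCII, emits upper-case letters, and the names nibble_to_char and bytes_to_hex are hypothetical.

```c
#include <stddef.h>  /* size_t */

/* Hypothetical helper: hex digit character for a value 0..15 (ASCII assumed). */
static char nibble_to_char(unsigned int nibble)
{
    nibble &= 0x0Fu;  /* keep only the low four bits */
    return (char)(nibble < 10 ? '0' + nibble : 'A' + (nibble - 10));
}

/* Writes 2 * n characters to dst; no '\0' terminator is added. */
void bytes_to_hex(const unsigned char *src, size_t n, char *dst)
{
    size_t i;

    for (i = 0; i < n; i++) {
        dst[2 * i]     = nibble_to_char(src[i] >> 4);  /* high nibble first */
        dst[2 * i + 1] = nibble_to_char(src[i]);       /* then low nibble */
    }
}
```

For the byte 0xA3 this produces the two characters 'A' and '3'. The destination must have room for twice as many characters as there are input bytes.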
3

If you have a "proper" array, like value as declared in the question, then you loop over the size of it to get each character. If you're on a system which uses the ASCII character set (which is most likely), then you can convert a hexadecimal digit in character form to its numeric value by subtracting '0' for digits (see an ASCII table to understand why), and by subtracting 'A' or 'a' and adding ten for letters (make sure no letter is higher than 'F', of course).

When you have the value from the first hexadecimal digit, convert the second hexadecimal digit the same way. Multiply the first value by 16 and add the second value. You now have a single byte value corresponding to two hexadecimal digits in character form.


Time for some code examples:

/* Function which converts a hexadecimal digit character to its integer value */
int hex_to_val(const char ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';  /* Simple ASCII arithmetic */
    else if (ch >= 'a' && ch <= 'f')
        return 10 + ch - 'a';  /* Because hex-digit a is ten */
    else if (ch >= 'A' && ch <= 'F')
        return 10 + ch - 'A';  /* Because hex-digit A is ten */
    else
        return -1;  /* Not a valid hexadecimal digit */
}

...

/* Source character array */
char value []={'0','2','0','c','0','3'};

/* Destination "byte" array */
char val[3];

/* `i < sizeof(value)` works because `sizeof(char)` is always 1 */
/* `i += 2` because there are two digits per value */
/* NOTE: This loop can only handle an array of even number of entries */
for (size_t i = 0, j = 0; i < sizeof(value); i += 2, ++j)
{
    int digit1 = hex_to_val(value[i]);      /* Get value of first digit */
    int digit2 = hex_to_val(value[i + 1]);  /* Get value of second digit */

    if (digit1 == -1 || digit2 == -1)
        continue;  /* Not a valid hexadecimal digit */

    /* The first digit is multiplied with the base */
    /* Cast to the destination type */
    val[j] = (char) (digit1 * 16 + digit2);
}

for (size_t i = 0; i < 3; ++i)  /* %zu is the size_t format; the casts avoid sign extension of char */
    printf("Hex value %zu = %02x\n", i + 1, (unsigned) (unsigned char) val[i]);

The output from the code above is

Hex value 1 = 02
Hex value 2 = 0c
Hex value 3 = 03

A note about the ASCII arithmetic: The ASCII value for the character '0' is 48, and the ASCII value for the character '1' is 49. Therefore '1' - '0' will result in 1.

Some programmer dude
  • I like your approach, it actually feels a bit more slim than mine. But I would change the return value from `hex_to_val` to `-1` for an invalid char, because `0` can also be a valid return value. So there is no way to distinguish if the character was actually `0` or invalid! – Toby Jan 29 '14 at 12:14
  • @Toby I was thinking about that too, don't really know why I didn't do it. Updated it now though. Thanks for the reminder. – Some programmer dude Jan 29 '14 at 12:18
2

It's easy with strtol():

#include <stdlib.h>
#include <assert.h>

void parse_bytes(unsigned char *dest, const char *src, size_t n)
{
    /** size 3 is important to make sure tmp is \0-terminated and
        the initialization guarantees that the array is filled with zeros */
    char tmp[3] = "";

    while (n--) {
        tmp[0] = *src++;
        tmp[1] = *src++;
        *dest++ = strtol(tmp, NULL, 16);
    }
}

int main(void)
{
    unsigned char d[3];
    parse_bytes(d, "0a1bca", 3);
    assert(d[0] == 0x0a);
    assert(d[1] == 0x1b);
    assert(d[2] == 0xca);
    return EXIT_SUCCESS;
}

If that is not available (even though it is NOT from string.h), you could do something like:

int ctohex(char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    switch (c) {
        case 'a':
        case 'A':
            return 0xa;

        case 'b':
        case 'B':
            return 0xb;

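        case 'c':
        case 'C':
            return 0xc;

        case 'd':
        case 'D':
            return 0xd;

        case 'e':
        case 'E':
            return 0xe;

        case 'f':
        case 'F':
            return 0xf;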
    }
    return -1;
}

void parse_bytes(unsigned char *dest, const char *src, size_t n)
{
    while (n--) {
        *dest = ctohex(*src++) * 16;
        *dest++ += ctohex(*src++);
    }
}
Brave Sir Robin
  • cant use string.h anyway in that device – ganeshredcobra Jan 29 '14 at 07:24
  • 2
    @luserdroog When you initialize fixed size array (or struct), and leave part of it uninitialized, then that part will be initialized to zeros. More for example [here](http://stackoverflow.com/questions/1065774/c-c-initialization-of-a-normal-array-with-one-default-value). – hyde Jan 29 '14 at 07:25
  • @ganeshredcobra strtol is from stdlib.h; nevertheless, I also edited the post to include an alternative without it. – Brave Sir Robin Jan 29 '14 at 07:31
  • @rmartinjak working on microcontroller it doesnt use stdlib.h also – ganeshredcobra Jan 29 '14 at 07:33
  • @hyde and martinjak. I stand corrected. c89.3.5.7: " If there are fewer initializers in a list than there are members of an aggregate, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration." and "If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant." – luser droog Jan 29 '14 at 07:44
  • @ganeshredcobra the second code block shows an alternative without strtol. The switch table is a bit lengthy, but it doesn't make any assumptions about the character set. If you're sure the letters are contiguous in your target platform's character set, you can use arithmetic instead. You also may or may not want to add proper error checking. – Brave Sir Robin Jan 29 '14 at 07:46
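
Two points from the last comment, character-set independence and error checking, can be combined in a shorter form. The following is a sketch rather than part of the answer: ctohex_lookup and parse_bytes_checked are hypothetical names, and the lookup searches two digit tables instead of using a switch, so it still makes no assumption that the letters are contiguous.

```c
#include <stddef.h>  /* size_t */

/* Hypothetical alternative to the switch: search digit tables, so no
   assumption is made that 'a'..'f' are contiguous in the character set. */
int ctohex_lookup(char c)
{
    static const char lower[] = "0123456789abcdef";
    static const char upper[] = "0123456789ABCDEF";
    int i;

    for (i = 0; i < 16; i++) {
        if (c == lower[i] || c == upper[i])
            return i;
    }
    return -1;  /* not a hexadecimal digit */
}

/* Same loop as parse_bytes, but stops and reports failure on a bad digit.
   n is the number of output bytes, so 2 * n characters are read from src.
   Returns 0 on success, -1 if any character is not a hex digit. */
int parse_bytes_checked(unsigned char *dest, const char *src, size_t n)
{
    while (n--) {
        int hi = ctohex_lookup(*src++);
        int lo = ctohex_lookup(*src++);

        if (hi < 0 || lo < 0)
            return -1;
        *dest++ = (unsigned char)(hi * 16 + lo);
    }
    return 0;
}
```

The linear search costs up to 16 comparisons per digit, which is usually acceptable for inputs of 8 or 10 characters. On a platform known to use ASCII, the arithmetic version is smaller and faster.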
1
Assuming 8-bit bytes (not actually guaranteed by the C standard, but ubiquitous), the range of `unsigned char` is 0..255, and the range of `signed char` is -128..127. ASCII was developed as a 7-bit code using values in the range 0-127, so the same value can be represented by both `char` types.

For the now-discovered task of converting a counted hex string from ASCII to unsigned bytes, here's my take:

unsigned int atob(char a){
    register int b;
    b = a - '0';    // subtract '0' so '0' goes to 0 .. '9' goes to 9
    if (b > 9) b = b - ('A' - '0') + 10;  // too high! try 'A'..'F'
    if (b > 15) b = b - ('a' - 'A');  // too high! try 'a'..'f'
    return b;
}

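#include <stdlib.h>  // malloc, free

void myfunc(const char *in, int n){
    int i;
    unsigned char *ba;
    ba=malloc(n/2);
    if (ba == NULL) return;  // allocation failed
    for (i=0; i+1 < n; i+=2){  // i+1 < n: never read past an odd-length input
        ba[i/2] = (atob(in[i]) << 4) | atob(in[i+1]);
    }
    // ... do something with ba
    free(ba);
}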
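
Since the asker reports that stdlib.h is not available on the target either, one possible way to avoid malloc is to convert in place: the output is half the size of the input, so it fits at the start of the same buffer. This is a sketch under assumptions (ASCII, a writable receive buffer); nibble_value and hex_in_place are hypothetical names, not part of the answer above.

```c
#include <stddef.h>  /* size_t */

/* Hypothetical helper (ASCII assumed): hex digit value, or -1 if invalid. */
static int nibble_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Converts the n hex digits in buf to n / 2 bytes, stored at the start of
   the same buffer. Returns the byte count, or -1 on odd n or a bad digit
   (in which case the buffer may already be partly overwritten). */
int hex_in_place(char *buf, size_t n)
{
    unsigned char *out = (unsigned char *)buf;  /* byte view of the same memory */
    size_t i;

    if (n % 2 != 0)
        return -1;

    for (i = 0; i < n; i += 2) {
        int hi = nibble_value(buf[i]);
        int lo = nibble_value(buf[i + 1]);

        if (hi < 0 || lo < 0)
            return -1;
        /* Index i / 2 is never ahead of i, so unread digits are not clobbered. */
        out[i / 2] = (unsigned char)((hi << 4) | lo);
    }
    return (int)(n / 2);
}
```

The trade-off is that the original characters are destroyed, so this only suits cases where the received text is no longer needed after conversion.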
luser droog
  • Am getting this array as an char array char value []={'0','2','0','c','0','3'}; How will i convert it to the array you mentioned its not clear? – ganeshredcobra Jan 29 '14 at 06:55
  • Expanded the answer a little. Not entirely sure what you need. – luser droog Jan 29 '14 at 06:58
  • Am receiving the data as char array from UART of a microcontroller i want convert it to a byte array that val[0] must be MSB and val[1] must be LSB .Like this i want to convert the whole char array to byte array. – ganeshredcobra Jan 29 '14 at 07:02
  • `char` and `byte` are the same thing. Are these pieces of larger (multi-byte) values? – luser droog Jan 29 '14 at 07:03
  • Nope its not part of large value the length of the array which i receive will be max 8 or 10. I want to merge val[0] and val[1] How can i do that by bit shifting or something like that?I Think the value am getting in char array is ascii value. – ganeshredcobra Jan 29 '14 at 07:07
  • Can you edit the question to add this (and more) information? What do you mean by wanting to "merge" the two values? If val[0] is the MSB and val[1] the LSB of a 16-bit (2-byte) unsigned integer (`unsigned short`), then you can re-form the integer with `(val[0]<<8)|val[1]` or `(unsigned)val[0]*256+val[1]`. – luser droog Jan 29 '14 at 07:12
  • I think the problem is that he gets the data as `char`, maybe over some kind of protocol/interface (so he has no influence on how he gets the data). This means that if the actual value is 0x3B, he will get two characters containing '3' and 'B'. If he just passes/copies this, he won't get 3B in his destination value, since these characters are internally represented in ASCII. – Toby Jan 29 '14 at 07:46
  • @luserdroog I believe they're supposed to represent nibbles of a byte, not bytes of a 16-bit word. i.e. `02` represents a *byte* `00000010`. may be wrong on that, but I don't think so. – WhozCraig Jan 29 '14 at 08:17
  • Took some effort to tease that out, but yes. Then I thought I should leave the answer just so these comments would remain for others. Finally, I guess I have to write the damned function! :) – luser droog Jan 29 '14 at 08:19
  • @luserdroog tease what out of me? I just got here =P – WhozCraig Jan 29 '14 at 08:21
  • No, I meant "tease out [the goal from the question]". – luser droog Jan 29 '14 at 08:22
  • +1 for your atob function. But why is b `unsigned char` and not for ex. `register int`? – Marian Jan 29 '14 at 15:30
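
For the other reading discussed in this thread, where val[0] and val[1] are the MSB and LSB of one 16-bit value, the shift-and-or expression from the comments can be wrapped as below. This is only an illustration of that expression; merge_bytes is a hypothetical name, and it does not apply if the characters are hex digits (nibbles), which is what the question turned out to mean.

```c
/* Combines two bytes into one 16-bit value: msb ends up in the high 8 bits.
   unsigned int is guaranteed to be at least 16 bits wide. */
unsigned int merge_bytes(unsigned char msb, unsigned char lsb)
{
    return ((unsigned int)msb << 8) | lsb;
}
```

The cast before the shift keeps the arithmetic unsigned, so a set top bit in msb cannot end up in a sign bit on targets where int is 16 bits.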