22

How do I correctly copy a char* to an unsigned char* in C? Here is my code:

int main(int argc, char **argv)
{
    unsigned char *digest;

    digest = malloc(20 * sizeof(unsigned char));
    strncpy(digest, argv[2], 20);
    return 0;
}

I would like to correctly copy a char* array to an unsigned char* array. I get the following warning with the above code:

warning: pointer targets in passing argument 1 of ‘strncpy’ differ in signedness

EDIT: Adding more information. My requirement is that the caller provides a SHA digest to the main function as a string on the command line, and the main function saves it internally in digest. A SHA digest is best represented using unsigned char.

Now the catch is that I can't change the signature of the main function (char **), because main parses other arguments that it requires as char* and not unsigned char*.

Rajiv
  • 2
    A hash digest is typically expressed as an ASCII representation of the hex value of the digest (e.g. "`b6379dab2c...`"). A `char` is absolutely fine for this! – Oliver Charlesworth Aug 04 '11 at 11:24
  • @Oli: So basically the cast, strncpy((char*)digest, argv[2], 20);, should also work without any problems, since we are dealing with ASCII? – Rajiv Aug 04 '11 at 11:26
  • @Rajiv: there are two different ways to represent an SHA-1 digest, which is 160 bits. One of those ways is to use 20 8-bit bytes, and `unsigned char` is the best type for this. The other way is to use an ASCII representation in which each character is a hexadecimal digit, representing 4 bits, and hence 40 of them are required. Clearly `strncpy` isn't going to convert between them. – Steve Jessop Aug 04 '11 at 11:55
  • @Steve: Yeah, I am using the unsigned char version with 20 8-bit bytes. If strncpy cannot do it, would memcpy or any other function do the trick? – Rajiv Aug 04 '11 at 14:34
  • 1
    @Rajiv: how do you think the user is going to type those 8-bit values at the terminal? What if one of them is 0? – Steve Jessop Aug 04 '11 at 15:05

7 Answers

15

To avoid the compiler warning, you simply need:

strncpy((char *)digest, argv[2], 20);

But silencing a compiler warning is often not a good idea; it's telling you that there is a fundamental incompatibility. In this case, the incompatibility is that char typically has a range of -128 to +127 (on platforms where plain char is signed), whereas unsigned char is 0 to +255.
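As a rough illustration of that range difference, the sketch below reads one byte of a char string through unsigned char. The helper name byte_value is hypothetical and not part of the question's code. The stored bits are the same either way; only the interpretation of bytes above 127 changes.

```c
#include <stddef.h>

/* Hypothetical helper: return byte i of a char string as a value in 0..255.
 * Where plain char is signed, s[i] itself may be negative for bytes >= 0x80;
 * converting to unsigned char gives the non-negative byte value. */
static unsigned int byte_value(const char *s, size_t i)
{
    return (unsigned char)s[i];
}
```

For example, byte_value("\xE5", 0) yields 0xE5 (229), whereas reading the same element as a signed char would give a negative number.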

Oliver Charlesworth
  • Yeah, that's the problem. How do I solve the incompatibility in a better way? – Rajiv Aug 04 '11 at 11:12
  • If you could tell us why you need it as an unsigned char, that might help us answer. To take a guess at a better solution: maybe you should be using a structure or union instead of a blob of unsigned char memory. – noelicus Aug 04 '11 at 11:32
  • In the case of `char *` vs `unsigned char *`, the warning (which, per the standard, the compiler is supposed to treat as an error!) is rarely indicative of any bug except a bug in the standard. Almost all of the standard functions take `char *` but deal with data which is really treated as an array of `unsigned char`. See `strcmp`. – R.. GitHub STOP HELPING ICE Aug 04 '11 at 15:17
  • @R..: What do you mean "treated as an array of `unsigned char`"? – Oliver Charlesworth Aug 04 '11 at 15:25
  • I gave `strcmp` as an example. It's required to make its comparison based on the difference between the first non-matching bytes *interpreted as `unsigned char`*. – R.. GitHub STOP HELPING ICE Aug 04 '11 at 16:31
  • There are also issues with the fact that, on a non-twos-complement implementation where `char` is signed, the value 0 could have two representations, while only the all-bits-0 byte is the null terminator. This means any function that deals with null terminated strings on such an implementation must be dealing with them as `unsigned char []` in order to tell the difference. Admittedly, any non-twos-complement implementation where plain `char` is signed would be rather stupid to begin with though... – R.. GitHub STOP HELPING ICE Aug 04 '11 at 16:33
6

You can't copy it "correctly" as such, because the types differ; that is exactly what the compiler is warning you about.

If you need to copy the raw bytes of argv[2], use the memcpy function.
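A minimal sketch of such a memcpy-based copy, assuming the 20-byte buffer from the question. The names copy_arg_bytes and DIGEST_BUF_LEN are made up for illustration. It checks the argument's length first so that memcpy neither reads past the end of the string nor writes past the buffer.

```c
#include <string.h>

#define DIGEST_BUF_LEN 20  /* assumed: matches the 20 bytes allocated in the question */

/* Hypothetical helper: copy at most DIGEST_BUF_LEN bytes of arg into digest.
 * Returns the number of bytes copied. It never reads beyond arg's terminator
 * and does not write a terminator into digest. */
static size_t copy_arg_bytes(unsigned char *digest, const char *arg)
{
    size_t n = strlen(arg);

    if (n > DIGEST_BUF_LEN)
        n = DIGEST_BUF_LEN;     /* truncate rather than overflow */
    memcpy(digest, arg, n);     /* memcpy takes void *, so no cast is needed */
    return n;
}
```

A caller would check argc > 2 first and then call copy_arg_bytes(digest, argv[2]). Whether silent truncation is acceptable depends on the application; rejecting over-long input may be the safer choice.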

Petr Abdulin
  • With `memcpy`, you first need to check the length of `argv[2]` to avoid accessing elements outside the array. – pmg Aug 04 '11 at 11:34
2

Cast the signedness away in the strncpy() call

strncpy((char*)digest, argv[2], 20);

or introduce another variable

#include <stdio.h>   /* fprintf */
#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* strncpy */

int main(int argc, char **argv)
{
    unsigned char *digest;
    void *tmp;                   /* (void*) is compatible with both (char*) and (unsigned char*) */

    digest = malloc(20 * sizeof *digest);
    if (digest) {
        tmp = digest;
        if (argc > 2) strncpy(tmp, argv[2], 20);
        free(digest);
    } else {
        fprintf(stderr, "No memory.\n");
    }
    return 0;
}

Also note that malloc(20 * sizeof(unsigned char*)) is probably not what you want. I think you want malloc(20 * sizeof(unsigned char)), or, as by definition sizeof (unsigned char) is 1, malloc(20). If you really want to use the size of each element in the call, use the object itself, like in my code above.

pmg
  • 1
    IMO, introducing a dummy variable here just obfuscates the code, with no corresponding benefit. – Oliver Charlesworth Aug 04 '11 at 11:22
  • The OP apparently wants a "better way than a cast". The obfuscated `(void*)` variable accomplishes a different way: I'll leave the decision if it's better to the OP (like you, @Oli, I think it isn't). – pmg Aug 04 '11 at 11:32
1

There is no one way to convert char * to unsigned char *. They point to data, and you must know the format of the data.

There are at least 3 different formats for a SHA-1 hash:

  • the raw binary digest as an array of exactly 20 octets
  • the digest as a hexadecimal string, like "e5e9fa1ba31ecd1ae84f75caaa474f3a663f05f4"
  • the digest as a Base64 string, like "5en6G6MezRroT3XKqkdPOmY/BfQ="

Your malloc(20 * sizeof(unsigned char)) has the exact size of a binary digest, but is too small to fit a hexadecimal string or a Base64 string. I guess that the unsigned char * points to a binary digest.
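To make the size comparison concrete, here is a small sketch of the storage each format needs. The constant and function names are hypothetical, and the Base64 figure assumes the usual padded encoding (4 output characters per 3 input bytes).

```c
#include <stddef.h>

enum {
    SHA1_RAW_LEN    = 20,                           /* 160 bits / 8 */
    SHA1_HEX_LEN    = SHA1_RAW_LEN * 2,             /* two hex digits per byte */
    SHA1_BASE64_LEN = ((SHA1_RAW_LEN + 2) / 3) * 4  /* padded Base64 */
};

/* Hypothetical helpers: bytes needed to hold each text form as a C string,
 * including the terminating '\0'. */
static size_t sha1_hex_buf_size(void)    { return SHA1_HEX_LEN + 1; }
static size_t sha1_base64_buf_size(void) { return SHA1_BASE64_LEN + 1; }
```

So a hexadecimal digest needs 41 bytes as a string and a Base64 digest needs 29, both more than the 20 bytes allocated in the question.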

But the char * came from the command-line arguments of main(), so the char * probably points to a string. Command-line arguments are always C strings; they end with the NUL terminator '\0' and never contain '\0' in the string. Raw binary digests might contain '\0', so they don't work as command-line arguments.

The code to convert a SHA-1 digest from hexadecimal string to raw binary might look like

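#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns a malloc'd 20-byte digest, or NULL on bad input or no memory. */
unsigned char *
sha1_from_hex(const char *hex)
{
    int i;
    unsigned int octet;     /* %x requires a pointer to unsigned int */
    unsigned char *digest;

    digest = malloc(20);
    if (!digest)
        return NULL;

    for (i = 0; i < 20; i++) {
        /* Require exactly two hex digits; %x alone would also
         * accept leading whitespace, a sign, or a "0x" prefix. */
        if (!isxdigit((unsigned char)hex[0]) ||
            !isxdigit((unsigned char)hex[1]))
            goto fail;
        if (sscanf(hex, "%2x", &octet) != 1)
            goto fail;
        digest[i] = (unsigned char)octet;
        hex += 2;
    }
    if (*hex)               /* reject trailing characters */
        goto fail;
    return digest;

fail:
    free(digest);
    return NULL;
}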

Don't use strncpy(dst, src, 20) to copy raw binary digests. The strncpy(3) function stops copying if it finds a '\0'; so if your digest contains '\0', you lose part of the digest.
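The difference can be seen with a small sketch. The function name copy_both_ways and the 4-byte sample value are made up for illustration; the sample deliberately contains a zero byte in the second position.

```c
#include <string.h>

/* Hypothetical demo: copy a 4-byte binary value containing a zero byte
 * using strncpy and memcpy, so the two results can be compared. */
static void copy_both_ways(unsigned char via_strncpy[4],
                           unsigned char via_memcpy[4])
{
    static const unsigned char raw[4] = { 0xE5, 0x00, 0xFA, 0x1B };

    /* strncpy stops at the 0x00 byte and fills the rest with zeros. */
    strncpy((char *)via_strncpy, (const char *)raw, 4);

    /* memcpy copies all four bytes regardless of their values. */
    memcpy(via_memcpy, raw, 4);
}
```

After the call, via_strncpy holds E5 00 00 00 while via_memcpy holds E5 00 FA 1B, so only memcpy preserves the bytes after the zero.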

George Koehler
1

Just put a cast, (char*) or (unsigned char*), in front of the pointer.

Stephen Rauch
1

You can use memcpy as:

memcpy(digest, argv[2], strlen(argv[2]) + 1);

since the underlying type of the objects pointed to by the src and dest pointers is irrelevant to this function.

cyber_raj
  • You have no guarantee accessing `argv[2][19]` is allowed. – pmg Aug 04 '11 at 11:33
  • I'm not sure the `'\0'` is needed in `digest`. Anyway, now you need to check that `strlen(argv[2])` is small enough for the size allocated for `digest` :) – pmg Aug 04 '11 at 12:16
  • @pmg hmm... then the OP has to synchronize the allocation size for digest with (strlen(argv[2]) + 1) * sizeof(unsigned char) – cyber_raj Aug 04 '11 at 12:29
  • On some strange machines a char may not be 8 bits wide (sizeof(char) is still 1 by definition), e.g. the TMS320C40. Showing my age here. – quickly_now Oct 23 '14 at 05:56
0

The warning means exactly what it says: you are passing an unsigned char *digest to strncpy, whose first parameter is a char *, so the pointer targets differ in signedness.

BenMorel
Sandeep Pathak