1

I'm looking for a fast way to convert a uint32_t to an int32_t. I can assume that all values are smaller than 2^31-1, so I do not need to worry about overflow at this point in the code. Can I assume that the sign bit of an int32_t sits in the same position as the most significant bit of a uint32_t? In coding terms, can I treat a pointer to a uint32_t value as a pointer to an int32_t value? Is the following code valid, platform-independent C, and will the last statement be true? (It is true on my Mac and with the Intel compilers on Linux.)

uint32_t ui = 3;
int32_t  *i = &ui;

(int32_t)ui == *i;
mwh
  • 2
    C doesn’t claim that you can, it doesn’t require a specific binary representation. However, pretty much all architectures use two’s complement where you could do this up to 2^31-1. If they used one’s complement the same would apply also, as well as if they used a sign bit. So it works pretty universally – Sami Kuhmonen Jun 09 '18 at 12:07
  • @SamiKuhmonen: It does require 2's complement for `int32_t`, though. It just doesn't guarantee that that type exists. – rici Jun 09 '18 at 14:10
  • @mwh did you mean that you can guarantee that all numbers are less than 2^31? Because *you* don't have to guarantee that a `uint32_t` is less than 2^32; that's the *definition* of an unsigned fixed width type. – rici Jun 09 '18 at 14:15
  • @rici sure, changed it in the question. – mwh Jun 10 '18 at 11:48

2 Answers

3

If you want it fast, and you're not concerned about overflow, you don't have to do anything fancy at all. If you simply write

i = ui;

any sane compiler is going to emit at most a single assignment instruction, although it will probably warn you about the type mismatch and the potential for overflow. If you write

i = (int32_t)ui;

any sane compiler will emit precisely the same assignment instruction, and will swallow its doubts (i.e. not warn you) about the type mismatch.

There's no need whatsoever to get involved with pointer aliasing or type punning -- this would only obfuscate the code, and potentially make it less reliable or less efficient.
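As a minimal sketch of the plain conversion described above (the function name `to_signed` is made up for illustration, and the value is assumed to be at most `INT32_MAX`):

```c
#include <stdint.h>

/* Hypothetical helper: converts by value, no pointers involved.
   Assumes ui <= INT32_MAX, in which case the value is unchanged. */
int32_t to_signed(uint32_t ui)
{
    int32_t i = (int32_t)ui;  /* the cast only documents intent and silences the warning */
    return i;
}
```

For example, `to_signed(3u)` yields `3`. For inputs above `INT32_MAX` the result is implementation-defined, so this sketch relies on the question's guarantee that such values never occur.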

If you're worried about conversions, and things like where the sign bit is, you really don't have to:

  • In two's complement arithmetic, no conversion is required, unless it's to detect overflow and throw an exception instead. In one's complement and sign-magnitude, no conversions are required, either, again unless to detect overflow and throw an exception.

  • Although it's wise not to make machine-dependent assumptions, for all intents and purposes you can assume that your code will run on a two's complement machine. (They're pretty much universal.)

  • Even if a conversion were required, it would not likely be appreciably slower than a straight assignment anyway.

  • By writing a straightforward assignment in C (as opposed to some pointer gyration), you ensure that, even if some conversion is somehow required, your compiler will take care of it.
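If overflow detection is ever wanted, a range check before the assignment is enough. The following is only a sketch; the name `checked_to_signed` and its out-parameter interface are assumptions made for illustration, and C has no exceptions, so failure is reported through the return value:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: stores the converted value in *out and returns true
   when ui fits in int32_t; otherwise returns false and leaves *out untouched. */
bool checked_to_signed(uint32_t ui, int32_t *out)
{
    if (ui > (uint32_t)INT32_MAX)
        return false;          /* value would not fit */
    *out = (int32_t)ui;        /* in range: value is preserved */
    return true;
}
```

The check is a single unsigned comparison, so even the checked version should cost very little compared with the plain assignment.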

Steve Summit
  • 2
    Minor remark, this has nothing to do with two's complement machine. `int32_t` is two's complement by definition. And all the same observation would hold for any signed/unsigned pair of integer types. As long as the value fits into the signed type, there is nothing to do. – Jens Gustedt Jun 09 '18 at 14:00
  • 2
    Representations other than 2s-complement are not relevant here. `int32_t` is guaranteed to be 2s-complement if it exists. It is guaranteed to be 32 bits wide with no padding. `uint32_t` is guaranteed to be the corresponding unsigned type (and must exist if `int32_t` exists). Corresponding integer types are guaranteed to have their non-sign bits in the same place. – rici Jun 09 '18 at 14:23
1

No, this is not platform-independent C code. The C standard does not require either int32_t or uint32_t to be defined.

I suspect you don't actually need C code that is portable to every implementation ever, though. You probably only need code that works on the platforms you care about, but I do not know which ones those are.

Assuming int32_t and uint32_t exist:

Using a pointer conversion to access an unsigned type as if it were the corresponding signed type, or vice versa, is valid. int32_t is the signed type corresponding to uint32_t. As pointed out by @rici, that's not just true in practice, it's a hard requirement.

At the same time, this is needlessly complicated. On current platforms in common use, a simple direct cast from uint32_t to int32_t or vice versa, without any pointer tricks, will require zero instructions at run time and do exactly what you'd expect already.
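To make the two options concrete, here is a rough sketch; the helper names `read_as_signed` and `convert_value` are invented for illustration. It assumes both types exist, which the preprocessor check enforces, and it writes the pointer conversion as an explicit cast, because C does not convert between these pointer types implicitly:

```c
#include <stdint.h>

/* int32_t and uint32_t are optional; their limit macros are defined
   only when the implementation provides the types. */
#if !defined(INT32_MAX) || !defined(UINT32_MAX)
#error "this sketch needs the exact-width types int32_t and uint32_t"
#endif

/* Hypothetical helper: reads a uint32_t object through an int32_t lvalue.
   Access through the corresponding signed type is permitted by the
   aliasing rules, but the pointer conversion needs an explicit cast. */
int32_t read_as_signed(const uint32_t *p)
{
    const int32_t *q = (const int32_t *)p;
    return *q;
}

/* Hypothetical helper: the simpler route, converting the value directly.
   Assumes ui <= INT32_MAX. */
int32_t convert_value(uint32_t ui)
{
    return (int32_t)ui;
}
```

For any value up to `INT32_MAX`, both helpers return the same number, so the pointer version buys nothing over the direct cast.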

  • *"Using a pointer conversion to access an unsigned type as if it were the corresponding signed type, or vice versa, is valid."* - actually it is undefined behavior. – John Zwinck Jun 09 '18 at 12:17
  • @JohnZwinck Actually, no, it's not. Accessing arbitrary types as if they were some other type, sure, but there's a special exception for corresponding integer types. –  Jun 09 '18 at 12:25
  • 1
    @JohnZwinck: C 2011 [N1570] 6.5 7 says “An object shall have its stored value accessed only by an lvalue expression that has one of the following types… a type that is the signed or unsigned type corresponding to the effective type of the object.” Do you think some other part of the C standard makes this undefined? Or is it the pointer conversion, not the access, you are concerned about? – Eric Postpischil Jun 09 '18 at 12:25
  • @EricPostpischil: This page explicitly says that `char`, `signed char`, and `unsigned char` are not compatible types (and therefore must not alias): http://en.cppreference.com/w/c/language/type (yes it's a C++-named site, but this is the C section). – John Zwinck Jun 09 '18 at 12:26
  • @JohnZwinck: I have quoted from the C standard (albeit the final draft before the official version). That is authoritative. cppreference.com is not authoritative. – Eric Postpischil Jun 09 '18 at 12:29
  • 1
    @JohnZwinck "Compatible type" means something else and is not relevant to aliasing questions. "Compatible type" means that as e.g. `int[]` and `int[3]` are compatible types, `int a[]; int a[3];` is a valid redeclaration. Obviously `signed char a; unsigned char a;` is not supposed to be valid. –  Jun 09 '18 at 12:29
  • @hvd: Here's an SO answer with 469 points that says otherwise (and links to the same page I linked above): https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule *"The strict aliasing rule makes this setup illegal: dereferencing a pointer that aliases another of an incompatible type is undefined behavior."* – John Zwinck Jun 09 '18 at 12:31
  • 1
    @JohnZwinck That same answer goes on to mention that there are exceptions. It only mentions `char` (and `signed` and `unsigned` variants) as an exception, but there are more. (Can't edit my earlier comment, but I'll admit "not relevant" was too strong, it should've been "not all that's relevant".) –  Jun 09 '18 at 12:33
  • @JohnZwinck: Stack Overflow answers are not authoritative and do not override the explicit language of the C standard. – Eric Postpischil Jun 09 '18 at 12:35
  • @EricPostpischil: To be honest I don't know if the excerpt from the C standard you quoted does or does not mean what you say it does. It's too obtuse. If you think a 469-point answer here is wrong, I do hope you'll go and edit it so we don't spread misinformation. – John Zwinck Jun 09 '18 at 13:47
  • @hvd `int32_t` and `uint32_t` are required to be corresponding integer types, if they exist. 7.20.1 paragraph 1: "When typedef names differing only in the absence or presence of the initial u are defined, they shall denote corresponding signed and unsigned types as described in 6.2.5; an implementation providing one of these corresponding types shall also provide the other." – rici Jun 09 '18 at 14:03
  • @JohnZwinck: C 2011 6.5 7 **is** the anti-aliasing rule in C. It is the source from which all other statements about anti-aliasing derive. It is not that obtuse. It says an object shall only be accessed by the listed types of expressions. That is the source of the anti-aliasing rule: Accessing an object with a type that is not one of those listed violates this paragraph. A type that is the signed or unsigned type corresponding to the effective type of the object is listed, so it is allowed. – Eric Postpischil Jun 09 '18 at 14:25