6

I want to set a FourCC value in C++, i.e. an unsigned 4 byte integer.

I suppose the obvious way is a #define, e.g.

#define FOURCC(a,b,c,d) ( (uint32) (((d)<<24) | ((c)<<16) | ((b)<<8) | (a)) )

and then:

uint32 id( FOURCC('b','l','a','h') );

What is the most elegant way you can think of to do this?

Brian Tompsett - 汤莱恩
Nick

12 Answers

17

You can make it a compile-time constant using:

template <int a, int b, int c, int d>
struct FourCC
{
    static const unsigned int value = (((((d << 8) | c) << 8) | b) << 8) | a;
};

unsigned int id(FourCC<'a', 'b', 'c', 'd'>::value);

With a little extra effort, you can make it check at compile time that each number passed in is between 0 and 255.
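One way that range check might look, as a sketch that assumes a C++11 compiler so that static_assert is available. The name CheckedFourCC is hypothetical and does not come from the answer; the byte layout matches the template above (a ends up in the low byte).

```cpp
#include <cstdint>

// Hypothetical variant of the FourCC template with compile-time range checks.
template <int a, int b, int c, int d>
struct CheckedFourCC
{
    // Reject any argument that does not fit in a single byte.
    static_assert(a >= 0 && a <= 255, "first character out of range");
    static_assert(b >= 0 && b <= 255, "second character out of range");
    static_assert(c >= 0 && c <= 255, "third character out of range");
    static_assert(d >= 0 && d <= 255, "fourth character out of range");

    // Convert to unsigned before shifting so that d << 24 cannot overflow int.
    static const std::uint32_t value =
        (static_cast<std::uint32_t>(d) << 24) |
        (static_cast<std::uint32_t>(c) << 16) |
        (static_cast<std::uint32_t>(b) << 8)  |
         static_cast<std::uint32_t>(a);
};
```

Something like CheckedFourCC<'a', 'b', 'c', 300>::value would then fail to compile instead of silently producing a wrong code.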

James Hopkin
  • very nice if the value is constant at compile time +1 – Doug T. May 01 '09 at 13:42
  • Ok, that's more interesting. :) – Nick May 01 '09 at 13:43
  • 1
    Any non-toy compiler will constant-fold the macro at compile time. – Dave May 01 '09 at 15:22
  • 2
    @Dave: that's true. This is really just a way of avoiding the macro without losing its advantages. The other replies suggesting inline functions no longer give you a compile-time constant, which may be handy. This gives some potential extra safety, but who's going to put the wrong thing in a FOURCC? To be honest, I'm not sure I'd bother with this myself, but I thought I'd put it out there. – James Hopkin May 01 '09 at 16:54
  • template might do the 0..255 check for you. – berkus May 19 '10 at 16:52
  • Won't this end up backwards on machines with different endianness? – fuzzyTew Apr 03 '11 at 02:43
  • @fuzzyTew It does the same as the macro version. Just like the macro one, it could be defined the other way round for little endian targets. – James Hopkin Apr 05 '11 at 08:55
  • To me redefinition for different endianness seems inelegant -- I was curious nobody tried to e.g. use a union with a char array. – fuzzyTew Apr 27 '11 at 13:11
  • 1
    @fuzzyTew Sanjaya did suggest that. I upvoted that answer (after initially misunderstanding it), but you can't get an integer compile-time constant that way. – James Hopkin Apr 28 '11 at 09:22
10
uint32_t FourCC = *((uint32_t*)"blah");

Why not this?

EDIT: int -> uint32_t.

And no, it does not cast a char** to uint32_t. It casts a (char*) to (uint32_t*), then dereferences the (uint32_t*). There is no endianness involved, since it's assigning a uint32_t to a uint32_t. The only defects are the alignment and that I hadn't explicitly indicated a 32-bit type.
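A minimal sketch of the same idea that sidesteps the alignment concern by copying the bytes with std::memcpy rather than dereferencing a cast pointer. The helper name fourcc_from_bytes is hypothetical. As with the cast, the integer value you get depends on the machine's byte order, but the four bytes in memory stay in string order.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret the four characters of a tag as a
// 32-bit integer without relying on the alignment of the string literal.
inline std::uint32_t fourcc_from_bytes(const char (&tag)[5])
{
    std::uint32_t value;
    // Copy exactly four bytes; the terminating '\0' is ignored.
    std::memcpy(&value, tag, sizeof value);
    return value;
}
```

The array-reference parameter means only four-character literals (plus the terminator) are accepted, so fourcc_from_bytes("blah") compiles while fourcc_from_bytes("toolong") does not.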

Sanjaya R
  • 3
    Because the value varies between big- and little-endian machines. On some architectures, that may be misaligned, resulting in a hardware exception at runtime. – Tom May 01 '09 at 14:56
  • I had considered this way and I believe it would work on all platforms we support. None of the proposals deal with endianness. TBH this isn't a concern as it will be platform specific data. – Nick May 01 '09 at 15:27
  • @James Hopkin: This isn't converting the pointer to an int, it converts the pointer to an int POINTER, then dereferences it. The value will always end up the same as if you used a union between 4 chars and an int and assigned the chars to {'b','l','a','h'}. On the topic of endianness, FOURCC codes should be in the same order in memory (http://msdn.microsoft.com/en-us/library/dd375802(VS.85).aspx alludes to this when it mentions little-endianness on Windows platforms) – Grant Peters Jun 17 '10 at 01:53
7

By using C++11 constexpr you can write something like:

constexpr uint32_t fourcc( char const p[5] )
{
    return (p[0] << 24) | (p[1] << 16) | (p[2] << 8) | p[3];
}

And then use it as:

fourcc( "blah" );

pros:

  • More readable,
  • if the string argument is known at compile time, then the function is evaluated at compile time (no run-time overhead).
  • doesn't depend on endianness (i.e. the first character of the argument will always be in the most significant byte of the fourcc).

cons:

  • Requires a C++11 (or later) compiler.
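To illustrate the compile-time evaluation point, here is a sketch that uses such a function in a static_assert and in case labels, both of which require constant expressions. The names fourcc_be and classify are hypothetical, the expected value assumes an ASCII execution character set, and the casts through unsigned char are an addition to avoid sign extension for characters above 127.

```cpp
#include <cstdint>

// Hypothetical constexpr packer: first character in the most significant byte.
constexpr std::uint32_t fourcc_be(const char (&p)[5])
{
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(p[0])) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(p[1])) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(p[2])) << 8)  |
            static_cast<std::uint32_t>(static_cast<unsigned char>(p[3]));
}

// Evaluated by the compiler: 'b' 'l' 'a' 'h' is 0x62 0x6C 0x61 0x68 in ASCII.
static_assert(fourcc_be("blah") == 0x626C6168u, "unexpected packing");

// Case labels must be constant expressions, so this only compiles because
// fourcc_be is constexpr.
inline int classify(std::uint32_t tag)
{
    switch (tag) {
    case fourcc_be("blah"): return 1;
    case fourcc_be("RIFF"): return 2;
    default:                return 0;
    }
}
```

Taking the argument as a reference to a five-element array, rather than as a pointer, also makes the compiler reject literals that are not exactly four characters long.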
MaxP
  • This code does not return the correct FOURCC, you need to do `(p[3] << 24) | (p[2] << 16) | (p[1] << 8) | p[0]` instead of `(p[0] << 24) | (p[1] << 16) | (p[2] << 8) | p[3]`. – Octo Poulos Nov 26 '21 at 15:09
  • @OctoPoulos - As stated, the first character of the string will end up in the most significant byte (i.e. p[0] << 24). If you print the fourcc integer in hex, you'll get the ASCII codes in the same order as the corresponding characters in the string. Endianness comes into play when you need to store the integer in a file or send it over the network. In that case, you may want to reverse the bytes composing the integer. You can use your solution and keep bytes in the order they'll be used, but I prefer to distinguish the data from its use. – MaxP Dec 15 '21 at 12:13
4

or do the same with an inline function

inline uint32_t FOURCC(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
     return (uint32_t(d) << 24) | (uint32_t(c) << 16) | (uint32_t(b) << 8) | uint32_t(a);
} 

and avoid the headaches of a macro, but otherwise your approach looks fine to me.

Doug T.
  • I'd lose the extra parentheses in the function version. :) – Brian Neal May 01 '09 at 14:03
  • 1
    Biggest downside is that you can't use an inline function for a case statement in a switch block. Either a template structure or a macro would work for that, though. – Tom May 01 '09 at 14:51
4

If I am not mistaken, you can just use multi-character character constants for that, right?

unsigned int fourCC = 'blah';

This is valid by the ANSI/ISO specification, though the resulting value is implementation-defined and some compilers will complain a little. This is how resource types used to be handled in the older Macintosh APIs.

D.Shawley
  • 2
    I think these are implementation defined and aren't necessarily portable. – Brian Neal May 01 '09 at 14:04
  • 3
    In other words, you may not be sure where the 'b' is going to end up, high order byte or low order. – Brian Neal May 01 '09 at 14:05
  • Additionally, I think some compilers may interpret the 'blah' as a byte and not a 32 bit integer. I think the standard says it should be interpreted as an int. – Nick May 01 '09 at 14:14
  • I'll have to look this one up... FWIW, remember that 'a' is an integer constant not a character. I'll add another comment once I track this one down in the spec. – D.Shawley May 02 '09 at 03:37
  • Nice catch... I just checked the spec and it is implementation defined. In that case, I would opt for the inline function case. – D.Shawley May 02 '09 at 03:43
  • It is implementation defined, but as far as I know the only use for multi-character literals is FourCCs. All the implementations I know of implement FourCC behavior. – bames53 Jul 25 '14 at 21:00
  • `'a'` is a `char` in C++ and an `int` in C. – bames53 Jul 25 '14 at 21:01
1

I see nothing wrong with your algorithm. But for something like this I would just write a function instead of a macro. Macros have a lot of hidden features / problems that can bite you over time.

uint32 FourCC(char a, char b, char c, char d) {
  return ( (uint32) (((d)<<24) | ((c)<<16) | ((b)<<8) | (a)) );
}
JaredPar
  • 1
    Functions can't be used as compile-time constants. That means switch statements are not possible with a function. Either a macro or a template class (using a static const int or enum for the result, but not a function) should be fine. – Tom May 01 '09 at 14:53
1

Assuming Windows (as FOURCC is a Windows concept), the Win API already provides mmioStringToFOURCC and mmioFOURCC.

Stu Mackellar
1

If a compile-time constant isn't required, perhaps the neatest is

unsigned int FourCCStr(const char (&tag)[5])
{
    return (((((tag[3] << 8 ) | tag[2]) << 8) | tag[1]) << 8) | tag[0];
}

#define FOURCC(tag) FourCCStr(#tag)

unsigned int id(FOURCC(blah));

This only accepts tags of four characters, as required.

James Hopkin
0

How about:

#if BYTE_ORDER == BIG_ENDIAN
#define FOURCC(c0,c1,c2,c3) ((uint32) ((((uint32)((uint8)(c0)))<<24) +(((uint32)((uint8)(c1)))<<16)+ (((uint32)((uint8)(c2)))<<8) + ((((uint32)((uint8)(c3)))))) 
#else
#if BYTE_ORDER == LITTLE_ENDIAN
#define FOURCC(c3,c2,c1,c0) ((uint32) ((((uint32)((uint8)(c0)))<<24) +(((uint32)((uint8)(c1)))<<16)+ (((uint32)((uint8)(c2)))<<8) + ((((uint32)((uint8)(c3)))))) 
#else
#error BYTE_ORDER not defined
#endif
#endif
H_squared
0

Nowadays there is a better solution using constexpr and C++17 (maybe earlier, not sure). I'm not sure if it's fully cross-platform, but it works in Visual Studio and Xcode.

First, you need a wrapper function to convert functions to compile time values:

template <class TYPE, TYPE VALUE> constexpr TYPE CompileTimeValue() { return VALUE; }

Then you need a constexpr function to convert a short string to an integer:

template <class UINT, UInt32 IS_LITTLE_ENDIAN> constexpr UINT MakeCCFromNullTerminatedString(const char * string)
{
    UINT cc = 0;

    UINT shift = 1;

    if (IS_LITTLE_ENDIAN)
    {
        shift = (sizeof(UINT) == 8) ? (0xFFFFFFFFFFFFFFull + 1) : 0xFFFFFF + 1;
    }

    while (UINT c = *string++)
    {
        c *= shift;

        cc |= c;

        if (IS_LITTLE_ENDIAN)
        {
            shift /= 256;
        }
        else
        {
            shift *= 256;
        }
    }

    return cc;
}

Then wrap in macros, to have both 4 byte and 8 byte character constants, with little and big endian variants (if you want)...

#define ID32(x) CompileTimeValue<UInt32,MakeCCFromNullTerminatedString<UInt32,0>(x)>() 
#define ID64(x) CompileTimeValue<UInt64,MakeCCFromNullTerminatedString<UInt64,0>(x)>()
#define CC32(x) CompileTimeValue<UInt32,MakeCCFromNullTerminatedString<UInt32,1>(x)>()
#define CC64(x) CompileTimeValue<UInt64,MakeCCFromNullTerminatedString<UInt64,1>(x)>()

Some tests to verify..

ASSERT(CC32("test") == 'test');

UInt32 v = CC32("fun");

UInt32 test;

switch (v)
{
case CC32("fun"):
    test = 1;
    break;

case CC32("with"):
    test = 2;
    break;

case CC32("4ccs"):
    test = 3;
    break;
}

Bounds-overrun checking is not done, though it could probably be added with compile-time assertions.
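One possible shape for such a check, as a sketch only: taking the literal by array reference lets the length be seen at compile time, so a static_assert can reject strings that do not fit. The name MakeCCChecked is hypothetical, the loop needs C++14 or later, and unlike MakeCCFromNullTerminatedString above this version always places the first character in the most significant used byte.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical checked packer: N includes the terminating '\0'.
template <class UINT, std::size_t N>
constexpr UINT MakeCCChecked(const char (&s)[N])
{
    // Refuse literals with more characters than the integer has bytes.
    static_assert(N - 1 <= sizeof(UINT), "too many characters for this type");

    UINT cc = 0;
    for (std::size_t i = 0; i + 1 < N; ++i)
    {
        // Shift previous characters up and append the next one.
        cc = static_cast<UINT>((cc << 8) | static_cast<unsigned char>(s[i]));
    }
    return cc;
}
```

With this, MakeCCChecked<std::uint32_t>("test") is a constant expression, while MakeCCChecked<std::uint32_t>("toolong") stops the build at the static_assert.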

James
0

Rather than a #define, I'd probably put pretty much the same code in a function and rely on the compiler to inline it.

Richard
-1
uint32 fcc(const char * a)
{   
    if( strlen(a) != 4)
        return 0;       //Unknown or unspecified format

    return 
    (
            (uint32) 
            ( 
                ((*(a+3))<<24) |
                ((*(a+2))<<16) |
                ((*(a+1))<<8) | 
                (*a)
            )       
    );
}
plan9assembler