Some C++ converted to C, still follows the Standard?

Question

Alright, so the Stockfish project has this chunk of code, which allows storing two ints (which are expected to between ~[-16000, 16000]

EDIT: I forgot the mention, the whole purpose of this is so we can add together two ints at a time. IE

a1+a2,b1+b2 = Decode(Encode(a1,b1) + Encode(a2,b2))

constexpr int make_score(int mg, int eg) {
  return (int)((unsigned int)eg << 16) + mg;
}

inline int eg_value(int s) {
  union { uint16_t u; int16_t s; } eg = { uint16_t(unsigned(s + 0x8000) >> 16) };
  return (int)eg.s;
}

inline int mg_value(int s) {
  union { uint16_t u; int16_t s; } mg = { uint16_t(unsigned(s)) };
  return (int)mg.s;
}

I would like to convert this to C. Obviously I can skip out on the inline and constexpr qualifiers. However, inline union definitions are not allowed. Additionally, I would prefer not to use a union, as it seems out of place IMO.

Here is what I have in C

#define MakeScore(mg, eg) ((int)((unsigned int)(eg) << 16) + (mg))

#define ScoreMG(s) ((int16_t)((uint16_t)((unsigned)((s)))))

#define ScoreEG(s) ((int16_t)((uint16_t)((unsigned)((s) + 0x8000) >> 16)))

As far as I can tell from my testing, these two versions behave the same, when compared between a c++ and a c program. The main difference is I have replace the final step of mg.s or eg.s (getting the signed part of the union), with a simple cast to int16_t.

Any thoughts, or direction to part of any C standard (using C98, but any version is likely the same here) would be appreciated.

My question is, does the version in C conform to the standard. I know the first version does, but is there anything happening with unions that is non obvious here? — AndrewGrant, Apr 01 '18 at 06:43
Your question is a bit broad. The `union` can be useful. Both values are 2-bytes, but when referring to the `uint16_t` value, you will not experience *sign-extension* if manipulating a value with the most-significant-bit `1`. The right shift by `16` is odd as it will always result in zero if your values for `s` are as you indicate, since you will be shifting an `unsigned` value in the range of `uint16_t` by `16` which won't leave many useful bits left over -- unless that is the point... — David C. Rankin, Apr 01 '18 at 06:46

score 2 · Accepted Answer · answered Apr 01 '18 at 07:49

2

The code can be simplified in C++ which is then valid C:

int eg_value2(int s) {
    return ((unsigned)s + 0x8000u) >> 16;
}

It produces the same assembly as the original:

lea eax,[rdi+0x8000]
shr eax,0x10
ret

By the way, the original code wrote to one member of the union then read from another, which is undefined behavior in C++. See Accessing inactive union member and undefined behavior?

answered Apr 01 '18 at 07:49

John Zwinck

239,568
38
324
436

Found the Undefined discussion associated with this https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/cPw8iYwvKM0 – AndrewGrant Apr 02 '18 at 04:33
@AndrewGrant: I wonder if the maintainers would accept a patch with code as in my answer. You're welcome to propose it to them--it certainly seems clearer and safer. – John Zwinck Apr 02 '18 at 12:39
Similar (if not the same) idea has been proposed. End story is that they will always have the undefined behavior when the values stores are outside my range, so why fix some other behavior. – AndrewGrant Apr 02 '18 at 13:09

Some C++ converted to C, still follows the Standard?

1 Answers1