What is the fastest way for bit operations to calculate a parity?

Question

My solution (for every bit of the input block, there is such a line):

*parity ^= (((x[0] >> 30) & 0x00000001) * 0xc3e0d69f);

All types are uint32. This line takes the second bit of the input x, shifts it to the LSB and sets all other bits to zero. Then, the 32-bit parity is XORed with the corresponding parity set for this bit.

I found that this multiplication solution is the fastest way to do this conditional XOR. Is there a faster way?

How big is the block you're trying to calculate the parity for? Wat's the point of multiplying by 0xc3e0d69f? — The Archetypal Paul, Nov 17 '10 at 20:08
I'm talking about C++. The data block is 256 bit (so uint32 x[0..7]. The parity is 32bit (stored in a uint32). For every set bit of the input, a specific XOR-Mask is applied to the parity field (parallel implementation of a LFSR). — vls, Nov 17 '10 at 21:23
Where does that magic number (0xc3E0d69F) come from? From *[Hacker's Delight](https://en.wikipedia.org/wiki/Hacker's_Delight)*? — Peter Mortensen, Aug 26 '22 at 21:34
The more general (and slower) case is *[Count the number of set bits in a 32-bit integer](https://stackoverflow.com/questions/109023/how-to-count-the-number-of-set-bits-in-a-32-bit-integer)*. — Peter Mortensen, Aug 30 '22 at 00:06

score 5 · Answer 1 · edited Aug 29 '22 at 23:03

5

See Compute parity in parallel for some neat hacks for calculating parity of a word, byte, etc.

edited Aug 29 '22 at 23:03

Peter Mortensen

30,738
21
105
131

answered Nov 17 '10 at 20:11

The Archetypal Paul

41,321
20
104
134

Thanks - I already tried out the hints on the page. But especially the "Conditionally set or clear bits without branching" is slower than the multiplication solution. – vls Nov 17 '10 at 21:49
This is a link-only answer. – Peter Mortensen Aug 29 '22 at 23:04
[Another answer](https://stackoverflow.com/questions/21617970/how-can-i-check-if-a-value-has-even-parity-of-bits-or-odd/21618038#21618038) uses something similar. – Peter Mortensen Aug 29 '22 at 23:54

score 3 · Accepted Answer · answered Nov 17 '10 at 21:51

I do not completely understand what kind of parity you mean, but if this line of code is doing that you want, it may be improved.

General rule: for x in {0, 1} x * N == -x & N

this because -x for 0 is all bits reset and for 1 is -1 in which all bits set.

So original line of code may be rewritten as:

*parity ^= (-((x[0] >> 30) & 0x00000001) & 0xc3e0d69f);

What two operations computed in less time than multiplication on many microprocessors, but you should check this.

Also code may take advantage of signed shift right

*parity ^= (((int32_t)x[0] << 1 >> 31) & 0xc3e0d69f);

First shift rshifts 30th bit into 31st, which is sign bit, and then second extend sign bit on all others as shift right on most machines act as floor(x / 2^N), thus fill shifted in bits with sign bit (abc...yz>>3 == aaaabc...yz).

But these tricks are stated as undefined behaviour in C standard and thus not portable. Use them carefully.

score 1 · Answer 3 · answered Nov 17 '10 at 20:34

1

Some processors will do this for you. See x86's parity flag.

answered Nov 17 '10 at 20:34

nmichaels

49,466
12
107
135

It's not just a even/odd parity, it has 32 bits. – vls Nov 17 '10 at 21:19
What do you suggest? Using assembly instead of C? Or some inline assembly? Perhaps expand the answer with a working example solution? – Peter Mortensen Aug 29 '22 at 23:07
I suggest looking at the processor you're targeting and seeing it it has a feature that calculates parity for you. If it does, look at your toolchain to see if it's exposed. Compilers frequently have primitives for single instructions like this. Of course, the usual caveats about premature optimization apply. Doing this the *fastest* way is not always the smartest way. Particularly on code that needs to be portable. – nmichaels Aug 31 '22 at 20:56

score 1 · Answer 4 · edited Aug 30 '22 at 00:04

A parity calculation task is equivalent of counting of ones. Also it called as "count set bits", "population count" or simply popcount. Some of processors have an efficient instruction to calculate it (POPCNT,VCNT).

I will suggest to use the lowest bit of popcount.

It can be accessed by inline assembler or by using builtins:

__builtin_popcount()/ __popcnt()/ std::bitset::count()

for GCC, Visual Studio, and C++.

Personally I prefer to give this job to the compiler by using __builtin_parity().

score 0 · Answer 5 · edited Aug 30 '22 at 00:00

0

If I understand the question correctly, you are doing

for (i = 0; i < 32; i++)
    *parity ^= (((x[0] >> i) & 1) * SOME_CONST[i]);

If so, it's better to use lookup tables:

for (i = 0; i < 4; i++)
    *parity ^= PARITY_LUT[i][ (x[0] >> (i*8)) & 0xFF];

It would cost 256 kilobytes, but it will be much faster.

edited Aug 30 '22 at 00:00

Peter Mortensen

30,738
21
105
131

answered Nov 18 '10 at 00:48

ruslik

14,714
1
39
40

What is the fastest way for bit operations to calculate a parity?

5 Answers5

Linked