0

I have a pointer to a char array, and I need to go along and XOR each byte with a 64 bit mask. I thought the easiest way to do this would be to read each 8 bytes as one long long or uint64_t and XOR with that, but I'm unsure how. Maybe casting to a long long* and dereferencing? I'm still quite unsure about pointers in general, so any example code would be much appreciated as well. Thanks!

EDIT: Example code (just to show what I want, I know it doesn't work):

void encrypt(char* in, uint64_t len, uint64_t key) {
        for (int i = 0; i < (len>>3); i++) {
            (uint64_t*)in ^= key;
            in += 8;
        }
    }
}
Nabushika
  • 364
  • 1
  • 17
  • 1
    You'll need to make sure your char[] is properly aligned if you want to take that route. Safer to work one byte at a time until performance demands otherwise. – ildjarn Jul 17 '16 at 08:46
  • Please add at least a [minimal](http://stackoverflow.com/help/mcve) example code to show exactly what would you like to achieve. Why do you need to XOR pointer bytes anyway? – Jezor Jul 17 '16 at 08:47
  • @ildjarn I'm gonna be doing this to several megabytes, so performance is crucial. :/ – Nabushika Jul 17 '16 at 08:47
  • 1
    @TheAbelo2 : Until you measure and find performance of the easiest _correct_ solution to be insufficient, performance is _not_ crucial. – ildjarn Jul 17 '16 at 08:48
  • 1
    `&in ^= key;` you probably mean `*in ^= key;` – n. m. could be an AI Jul 17 '16 at 09:04
  • It is also unclear why your mask needs to be 64 bit. You are only using the low char-worth bits of it. – n. m. could be an AI Jul 17 '16 at 09:05
  • @n.m. I SAID my example code didn't work... you'll also notice I was incrementing the pointer by 8 bytes each time. – Nabushika Jul 17 '16 at 09:08
  • 1
    One of the reasons that it doesn't work is that you're XORing against the address of `in` rather than its content. `(long *)in ^= key;` – David Hoelzer Jul 17 '16 at 09:24
  • 1
    I know of three "obviously correct" ways for a 64-bit integer to be laid out as a sequence of eight bytes, and I'm pretty sure I've seen all three on ordinary desktop computers -- you *really* shouldn't rely on the contents of a 64-bit integer being laid out in any particular way. –  Jul 17 '16 at 09:48
  • @Hurkyl A good point! – anatolyg Jul 17 '16 at 10:06
  • @TheAbelo2 "several megabytes" is trivial for a modern computer to process. Using non-portable assumptions about byte layout and/or unaligned access will either gain nothing, spuriously shuffle the data, make things slower, or generate a trap and crash your program. Why bother? – underscore_d Jul 17 '16 at 10:07
  • **Anyway**, by casting a pointer to an incompatible type and dereferencing it, this code violates strict aliasing. So, the optimiser is free to trash it - or worse. You can only cast _to_ `char *` and retain defined behaviour. – underscore_d Jul 17 '16 at 10:11

1 Answers1

3

The straightforward way to do your XOR-masking is by bytes:

void encrypt(uint8_t* in, size_t len, const uint8_t key[8])
{
    for (size_t i = 0; i < len; i++) {
        in[i] ^= key[i % 8];
    }
}

Note: here the key is an array of 8 bytes, not a 64-bit number. This code is straightforward - no tricks needed, easy to debug. Measure its performance, and be done with it if the performance is good enough.

Some (most?) compilers optimize such simple code by vectorizing it. That is, all the details (casting to uint64_t and such) are performed by the compiler. However, if you try to be "clever" in your code, you may inadvertently prevent the compiler from doing the optimization. So try to write simple code.

P.S. You should probably also use the restrict keyword, which is currently non-standard, but may be required for best performance. I have no experience with using it, so didn't add it to my example.


If you have a bad compiler, cannot enable the vectorization option, or just want to play around, you can use this version with casting:

void encrypt(uint8_t* in, size_t len, uint64_t key)
{
    uint64_t* in64 = reinterpret_cast<uint64_t*>(in);
    for (size_t i = 0; i < len / 8; i++) {
        in64[i] ^= key;
    }
}

It has some limitations:

  • Requires the length to be divisible by 8
  • Requires the processor to support unaligned pointers (not sure about x86 - will probably work)
  • Compiler may refuse to vectorize this one, leading to worse performance
  • As noted by Hurkyl, the order of the 8 bytes in the mask is not clear (on x86, little-endian, the least significant byte will mask the first byte of the input array)
Community
  • 1
  • 1
anatolyg
  • 26,506
  • 9
  • 60
  • 134
  • What is size_t? I've never come across it before. – Nabushika Jul 17 '16 at 09:01
  • `size_t` is the proper type for sizes; explained [here](http://en.cppreference.com/w/cpp/types/size_t) – anatolyg Jul 17 '16 at 09:11
  • 1
    `std::uint8_t` is not a valid type for reading object representation as it is under no obligation to be a `char` type, which are the only ones that have that allowance, so using it is non-portable and risks UB. Then your 2nd half is very poor advice and a massively efficient generator of UB, as casting _from_ `char *` violates strict aliasing (there's no symmetry with the allowance for casting _to_ `char *`) – underscore_d Jul 17 '16 at 10:15
  • 1
    Your last suggestion is a strict aliasing violation, so it also requires the compiler to support a "no strict aliasing" dialect – M.M Jul 17 '16 at 10:16
  • @underscore_d You are welcome to write a version that doesn't violate strict aliasing and does work with 64-bit types. I have no idea how to write it, and I suspect it's impossible. Please correct me if I am wrong. – anatolyg Jul 17 '16 at 10:24
  • 1
    @anatolyg The portable method is to consider each `char` independently, which you _almost_ got in your 1st section but lost by using `std::uint8_t`, which isn't guaranteed to be a typedef to a `char` type (I view the fact that it _usually_ is as immaterial). If by "work with 64-bit types" you mean process more than one `char` at a time through use of a wider type, then no, I don't believe there's any portable way to achieve that - you become dependent on one platform's representation of a 64-bit `int`, & worse, it falls victim to strict aliasing - hence I felt it worthwhile to make that clear. – underscore_d Jul 17 '16 at 10:30
  • @underscore_d : It can be done by memcpying each successive 8 bytes into a local uint64, then updating the local and memcpying it back. Portable, and easily vectorizable. – ildjarn Jul 17 '16 at 11:42
  • @ildjarn Good point. But does that practically depend on the assumption/hope - which is presumably why I didn't think of it ;) - that said `memcpy`s will be optimised into direct reads/writes? I've seen this happen. Definitely, `memcpy` is 'the' way to do this sort of thing (if you really have to!) without incurring the Wrath of Standard and _in the right conditions_ with the same resulting code anyway. But if the `memcpy`s were not optimised away, do you think there would be any performance gain over just processing `char`-by-`char`? I don't know which directionality to expect wrt performance – underscore_d Jul 17 '16 at 11:51
  • 1
    @underscore_d : "*But if the `memcpy`s were not optimised away, do you think there would be any performance gain over just processing `char`-by-`char`?*" If you use the appropriate block sizes (i.e. multiples of `8`) and the whole thing gets vectorized, then absolutely. Personally, I wouldn't bother with anything other than char-by-char unless my implementation used manual SIMD intrinsics since it's not worth the bother if the compiler decides to not autovectorize. – ildjarn Jul 17 '16 at 12:40