
Good evening fellow coders, I'm a newbie in C++ and after searching I have not found a helpful answer. I'm testing #pragma pack right now and want to compare access speeds with unaligned memory, so I have a few structs whose access time I want to measure.

struct ReallySlowStruct {
    char c : 6;
    __int64 d : 64;
    int b : 32;
    char a : 8;
};

And

int main() {
    struct ReallySlowStruct s;

    std::chrono::steady_clock::time_point begin = std::chrono::high_resolution_clock::now();
    s.a = 'c';
    s.d = 100;
    s.b = 50;
    s.a = 'a';
    std::chrono::steady_clock::time_point end = std::chrono::high_resolution_clock::now();
    std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count() << std::endl;
    system("pause");
    return 0;
}

This is the way I found to measure elapsed time using the chrono library. However, the output is always 0 and I don't know why. I've tried the different clocks, as well as different time units, but it's always zero. Can anyone explain what's wrong with it? I'm using VS 2013.

Grougal
  • The `ReallySlowStruct` is not nearly that slow -- four accesses, even unusually aligned ones, will take much less than 1 ms on any modern computer. – Jeremy Roman Sep 11 '16 at 18:31
  • What compiler flags are you using? I would not be surprised if those assignments either get rolled into the initialization as an optimization, or just optimized away completely since they have no side effects at all; `s` is never used after the assignments, so some compilers will just pretend it is not even there in the first place. – caps Sep 11 '16 at 18:32
  • How long do you expect it to take? Hint: Try casting to `std::chrono::nanoseconds`. – Jørgen Sep 11 '16 at 18:32
  • You are mixing two different clocks. – Galik Sep 11 '16 at 18:38
  • Thanks for the answers, `std::chrono::nanoseconds` yields the same results though. I don't know about compiler flags, where can I check them and what do I need to care about? – Grougal Sep 11 '16 at 18:41
  • Once you fix all the non-standard code, mixing of clocks and the [`system("pause");`](http://stackoverflow.com/questions/1107705/systempause-why-is-it-wrong) you see that the access is properly [optimized away](https://godbolt.org/g/fn1Xs6). – nwp Sep 11 '16 at 18:43
  • @nwp The mixing of clocks was my fault when switching clocks up, I used steady clock on my first few tries. What is there to fix about the system("pause") bit? I want to halt the window so it doesn't close immediately after calculating it. Also, how do I stop the automatic optimization? – Grougal Sep 11 '16 at 18:48
  • A classic strategy for measuring really fast operations is to put them in a loop and do them a million times. In this case, your OS might let you read the hardware TSC if `high_resolution_clock` somehow doesn't. But, back-of-the-envelope calculation: how much time do a dozen clock cycles take on a 2GHz CPU? Milliseconds, microseconds or nanoseconds? – Davislor Sep 11 '16 at 18:49
  • The `system("pause");` above is a link, so you can read why people don't like it. You should also rethink your goal. You now know that the access costs nothing because of the optimizer. If you do tricks to disable the optimizer you are not measuring real code, you are measuring how much time an artificial slowdown takes, which is not very useful. – nwp Sep 11 '16 at 18:52
  • @nwp In this case, the only reason why it's getting optimized away is because I'm not using the struct, correct? In a realistic scenario I would obviously use its members somewhere, so understanding how alignment affects memory can't be useless, no? That said, I have now used a for loop with 50.000 iterations, and now it either outputs a number of about or exactly "500.100" OR 0. Shouldn't it be lower and higher, but not either 0 or >500.000? – Grougal Sep 11 '16 at 18:58
  • Micro-benchmarking is very tricky. I wouldn't know how to set this up to get a meaningful number. If you have a real program, measure over a longer period (like a second) how long it takes with each alignment. Otherwise don't worry about it. If you seriously want to go into micro-benchmarking you will need to learn a lot about compiler optimizations and you should consider using a library such as [nonius](https://github.com/rmartinho/nonius/) which does some of the tricks for you. – nwp Sep 11 '16 at 19:10
  • @nwp Well thank you then. I have been programming more or less actively for 5 years now, mostly in Java, but I've never really tried to expand my knowledge much in all this time, so I'm trying to absorb as much as I can now. For example, I always knew there was some pre-processing going on (the optimizer is some kind of pre-processing to my understanding), but I would never have guessed that it would just remove code like that. Thanks! – Grougal Sep 11 '16 at 19:17
  • Confused: is this structure packed? Because if it isn't, the int64 will be aligned normally. Could you include the pragmas around the struct if so? – kfsone Sep 11 '16 at 19:51
  • Also, you don't need to write `struct ReallySlowStruct` other than in the definition or a forward declaration of the type, that's a C requirement. – kfsone Sep 11 '16 at 19:54
  • @kfsone I'm using `#pragma pack(push,1)` before the definition of the struct and a `#pragma pack(pop)` after, although I'm not sure what the pop in the end does. I do generally understand how the packing functions, though. Also, thanks for the added information about structs. – Grougal Sep 11 '16 at 20:16
  • @Grougal might be helpful to add the pragmas to your post – kfsone Sep 11 '16 at 20:19
  • @kfsone I have written that I'm testing packed structures and my problem was more about the measurement of time than about the packing itself, so I didn't include it. – Grougal Sep 11 '16 at 20:26
