static array of structs, why does calling bzero result in improved performance?

Question

I am implementing an order book in c++. I'm modelling it after this implementation which has won awards for speed.

The reason it is fast is explained as using static arrays so that the entire order book is allocated on the stack instead of the heap.

What surprised me is that there is a significant performance gain from initializing the arrays with bzero. Below is init function in the linked c implementation. Commenting out the bzero calls results in roughly 2-3x slower performance when I fed it sample data and measured the results.

void init() {
  /* Initialize the price point array */
  bzero(pricePoints, (MAX_PRICE + 1) * sizeof(pricePoint_t));  

  /* Initialize the memory arena */
  bzero(arenaBookEntries, MAX_NUM_ORDERS * sizeof(orderBookEntry_t));

  ...
}

In my C++ code, my static arrays were written like below. I was under the impression in C++ 11 that setting an array = {} would initialize it with zero values. Hence I didn't think I would need to use bzero. Testing against sample data indicated my implementation was equivalent to the c implementation without bzero. Adding the bzero call to the constructor of my OrderBook class, resulted in performance equivalent to the C implementation.

So my question is why did using bzero result in faster performance?

// header file
OrderBook {
   static PricePoint _price_points[MAX_PRICE / TICK_SIZE];
   static Order _orders[MAX_ORDERS]; 
}

//cpp file
PricePoint OrderBook::_price_points[MAX_PRICE / TICK_SIZE] = {}
Order OrderBook::_orders[MAX_ORDERS] = {};

Just out of curiosity , have you tried memset as memset is standard, bzero isn't. — Abhishek Chandel, Oct 31 '19 at 04:52
Did you add the time spent in `bzero` to your calculations? I suspect that the total time is actually the same if you include the `bzero` time. The first access to `_price_points` and `_orders` will incur a page fault, which takes time. If you `bzero` the memory, then you are front-loading the cost. If you don't `bzero` memory, then you defer the cost. The cost is the same; it's just a question of when you pay it. — Raymond Chen, Oct 31 '19 at 04:57
My measurements were specifically for the time it took for orders to be placed, not initialization. Specifically I was wrapping the `limit` and `cancel` functions to determine speed of those methods. Note my measurements are over 260,000 calls of these methods. Could the initial page fault really explain the cost of the average speed being 2-3x slower? (It might but i'm not sure)... Also possibly dumb question, but does a pagefault need to happen for variables allocated on the stack? — Terence Chow, Oct 31 '19 at 05:23
why not start bench-marking your performance after first k calls of `limit` and `cancel` functions? @TerenceChow — v78, Oct 31 '19 at 05:37
Yes stack allocations are normal memory and require a page fault when first used. Static variables aren't on the stack normally https://stackoverflow.com/questions/93039/where-are-static-variables-stored-in-c-and-c — Alan Birtles, Oct 31 '19 at 07:25

static array of structs, why does calling bzero result in improved performance?

0 Answers0