
I was watching a session by Daniel Hanson from CppCon 2019 about applying modern C++ in quantitative finance. One of the key ideas is to run the equity price generator asynchronously so that the Monte Carlo simulation can be greatly accelerated.

This function in particular caught my attention. It works as intended in Visual Studio but fails to run asynchronously on my Linux machine.

void MCEuroOptPricer::computePriceAsync_() {
  // a callable object that returns a `std::vector<double>` when called
  EquityPriceGenerator epg(spot_, numTimeSteps_, timeToExpiry_, riskFreeRate_,
                           volatility_);
  // fills `seeds_` (a std::vector<int>) with `std::iota`
  generateSeeds_();

  std::vector<std::future<std::vector<double>>> futures;
  futures.reserve(numScenarios_);
  for (auto &seed : seeds_) {
    futures.push_back(std::async(epg, seed));
  }

  std::vector<double> discountedPayoffs;
  discountedPayoffs.reserve(numScenarios_);
  for (auto &future : futures) {
    double terminalPrice = future.get().back();
    double payoff = payoff_(terminalPrice);
    discountedPayoffs.push_back(discFactor_ * payoff);
  }

  double numScens = static_cast<double>(numScenarios_);
  price_ =
      quantity_ * (1.0 / numScens) *
      std::accumulate(discountedPayoffs.begin(), discountedPayoffs.end(), 0.0);
}

I am compiling with clang++ -std=c++17 -O3. This parallel version runs even slower than the sequential version, and according to htop it is not using multiple cores. I tried calling std::async with std::launch::async, but that did not help either. Am I missing some compiler options, or is Visual Studio's compiler applying some optimization that I am unaware of? How can I make this function run asynchronously on Linux?

Not a CS major so I might just be missing something obvious. Any help is greatly appreciated.

UPDATE: it turns out that std::async currently uses a thread pool on Windows, but not on UNIX-like systems. This article by Dmitry Danilov explains this in detail.
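Without a pool behind std::async, launching one task per scenario can mean one thread per scenario. One dependency-free workaround is to launch only about as many tasks as there are hardware threads and let each task process a chunk of seeds. Below is a minimal sketch of that idea, not the code from the talk. `simulateScenario` and `sumScenariosChunked` are hypothetical names, and the dummy body of `simulateScenario` merely stands in for the price generator and payoff.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-scenario work (price path + payoff).
// It only needs to be safe to call from several threads at once.
double simulateScenario(int seed) {
  return static_cast<double>(seed % 7);
}

// Sums simulateScenario over all seeds using at most one task per hardware
// thread, instead of one std::async call per seed.
double sumScenariosChunked(const std::vector<int> &seeds) {
  // hardware_concurrency() may return 0 when unknown, so fall back to 1.
  const std::size_t hw = std::max(1u, std::thread::hardware_concurrency());
  const std::size_t numTasks = std::min<std::size_t>(hw, seeds.size());
  if (numTasks == 0) return 0.0;
  const std::size_t chunk = (seeds.size() + numTasks - 1) / numTasks;

  std::vector<std::future<double>> futures;
  futures.reserve(numTasks);
  for (std::size_t begin = 0; begin < seeds.size(); begin += chunk) {
    const std::size_t end = std::min(begin + chunk, seeds.size());
    // Each task reads its own [begin, end) slice and returns a partial sum.
    futures.push_back(std::async(std::launch::async, [&seeds, begin, end] {
      double partial = 0.0;
      for (std::size_t i = begin; i < end; ++i) {
        partial += simulateScenario(seeds[i]);
      }
      return partial;
    }));
  }

  double total = 0.0;
  for (auto &f : futures) total += f.get();
  return total;
}
```

Because the number of threads stays bounded by the core count, this should also avoid exhausting thread resources when the number of scenarios is large.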

I managed to get similar performance on WSL2 as native Windows with an implementation involving boost/asio/thread_pool.hpp.

void MCEuroOptPricer::computePriceWithPool_() {
  EquityPriceGenerator epg(spot_, numTimeSteps_, timeToExpiry_, riskFreeRate_,
                           volatility_);
  generateSeeds_();

  std::vector<double> discountedPayoffs;
  discountedPayoffs.reserve(numScenarios_);

  std::mutex mtx; // avoid data races when writing into the vector
  boost::asio::thread_pool pool(get_nprocs());
  for (auto &seed : seeds_) {
    boost::asio::post(pool, [&]() {
      double terminalPrice = (epg(seed)).back();
      double payoff = payoff_(terminalPrice);
      std::lock_guard<std::mutex> lock(mtx); // released when the lambda returns
      discountedPayoffs.push_back(discFactor_ * payoff);
    });
  }
  pool.join();

  double numScens = static_cast<double>(numScenarios_);
  price_ =
      quantity_ * (1.0 / numScens) *
      std::accumulate(discountedPayoffs.begin(), discountedPayoffs.end(), 0.0);
}
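The mutex is only needed because every task appends to the same vector. If the result vector is sized up front and each scenario writes to its own index, no lock is required and the results keep the order of the seeds. The sketch below illustrates this with plain std::thread workers and an atomic counter rather than boost::asio::thread_pool, purely to keep it self-contained. `discountedPayoff` and `computePayoffsIndexed` are hypothetical names, and the body of `discountedPayoff` is a placeholder for the real pricing work.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for `discFactor_ * payoff_(epg(seed).back())`.
double discountedPayoff(int seed) {
  return 0.5 * static_cast<double>(seed);
}

std::vector<double> computePayoffsIndexed(const std::vector<int> &seeds) {
  std::vector<double> results(seeds.size()); // one slot per scenario
  std::atomic<std::size_t> next{0};          // next unclaimed scenario index

  // Each worker repeatedly claims an index and fills only that slot, so no
  // two threads ever write to the same element.
  auto worker = [&] {
    for (std::size_t i = next.fetch_add(1); i < seeds.size();
         i = next.fetch_add(1)) {
      results[i] = discountedPayoff(seeds[i]);
    }
  };

  // hardware_concurrency() may return 0 when unknown, so fall back to 1.
  const unsigned numWorkers =
      std::max(1u, std::thread::hardware_concurrency());
  std::vector<std::thread> workers;
  workers.reserve(numWorkers);
  for (unsigned w = 0; w < numWorkers; ++w) workers.emplace_back(worker);
  for (auto &t : workers) t.join();
  return results;
}
```

The same indexing idea should carry over to the pool version above by posting tasks that capture an index by value, assuming the vector is resized (not just reserved) before any task runs.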
Calvin Yao
  • Might be time to take a look at g++ ;) Relevant links: https://stackoverflow.com/a/24069313/421195, https://reviews.llvm.org/D52193 – paulsm4 Jul 29 '20 at 04:04
  • @paulsm4 Thank you for the comment! After some digging, it seems that std::async uses a thread pool ONLY on Windows. g++ and clang++ on WSL and clang++ on macOS all give me a slower parallel version, while clang++ and VC++ on Windows work as intended. Also, if I force the generator to run with launch::async, the program throws a system_error when I supply a relatively large numScene. I believe I need to read much more about multithreading before I can get it to work on Linux. – Calvin Yao Jul 31 '20 at 18:03

0 Answers