
I want to run a for loop in parallel threads.

My dataset is from the stock market and is of the form:
Type ID Token Buy/Sell Price Quantity

Type is:
N for New Order
M for Modifying an Order
X for cancelling an order
T for an order that has been traded

ID is a sixteen-digit unique number corresponding to a particular order. It is generated when a New Order arrives, and it is required when an order is Modified, Cancelled or Traded.

Token identifies a company in the stock market. Tokens are five-digit numbers.

A Trade message is a bit different. It's of the form:
Type BuyOrderID SellOrderID Token Price Quantity

Example rows for the four types:

N   1200000000006773    48256    B  13595   4000
M   1200000000006773    48256    B  13585   4000
X   1200000000006649    48331    B  70125   500
T   1200000000009326    1200000000007756    48321   7275    8000

Now I want to parse through each row, store the tokens in a hashtable, and go on with the corresponding actions.
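For illustration, here is a minimal sketch of the sequential version, using the four sample rows above. It assumes the fields are whitespace-separated and uses an R environment as the hashtable; the per-token "action" is just a message count, which is a placeholder for the real logic. The names `rows` and `token_counts` are hypothetical.

```r
# Sample rows in the format described above
rows <- c(
  "N   1200000000006773    48256    B  13595   4000",
  "M   1200000000006773    48256    B  13585   4000",
  "X   1200000000006649    48331    B  70125   500",
  "T   1200000000009326    1200000000007756    48321   7275    8000"
)

# An environment serves as a hash table keyed by token
token_counts <- new.env(hash = TRUE)

for (row in rows) {
  fields <- strsplit(row, "[[:space:]]+")[[1]]
  type <- fields[1]
  # Trade rows carry two order IDs, so their token is the fourth field
  token <- if (type == "T") fields[4] else fields[3]
  # Placeholder action: count how many messages each token has received
  old <- if (exists(token, envir = token_counts, inherits = FALSE)) {
    get(token, envir = token_counts)
  } else {
    0
  }
  assign(token, old + 1, envir = token_counts)
}

print(mget(ls(token_counts), envir = token_counts))
```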

I have more than a billion rows, so I need parallel processing for faster results. How can I change the for loop (by whatever means) to make use of parallel processing?

Thanks in advance.

    Please provide a [minimal reproducible and complete example](http://stackoverflow.com/a/5963610/1412059). You need to seriously investigate if you can't avoid an R-level loop. I seriously doubt that parallelization alone (even using a large number of CPUs) will help with your performance problems. – Roland May 30 '16 at 07:08

1 Answer


You could use the parallel package:

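library(parallel)
cores.number <- detectCores()
# "FORK" clusters work only on Unix-like systems; use type = "PSOCK" on Windows
cl <- makeCluster(cores.number, type = "FORK")
seed <- 123  # any integer; makes the workers' random-number streams reproducible
clusterSetRNGStream(cl, iseed = seed)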

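And then use parLapply, parSapply, etc. like this, where X is a vector or list and FUN is the function to apply to each element:

parSapply(cl, X, FUN)
stopCluster(cl)  # release the workers when done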

Check the package documentation: https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf
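As a rough sketch of how this could apply to the data in the question: split the rows into chunks, process each chunk on a worker, then merge the partial results. The helper `count_tokens` and the per-token message count are hypothetical stand-ins for the real per-row actions, and the example uses two workers with the default PSOCK cluster type so it runs on any platform. It assumes the work on each chunk is independent; if the actions depend on the order of messages for an order ID or token, the rows would need to be partitioned by that key instead of split arbitrarily.

```r
library(parallel)

# Sample rows from the question
rows <- c(
  "N   1200000000006773    48256    B  13595   4000",
  "M   1200000000006773    48256    B  13585   4000",
  "X   1200000000006649    48331    B  70125   500",
  "T   1200000000009326    1200000000007756    48321   7275    8000"
)

# Hypothetical per-chunk worker: count messages per token in one chunk
count_tokens <- function(chunk) {
  fields <- strsplit(chunk, "[[:space:]]+")
  # Trade rows carry two order IDs, so their token is the fourth field
  tokens <- vapply(fields,
                   function(f) if (f[1] == "T") f[4] else f[3],
                   character(1))
  table(tokens)
}

cl <- makeCluster(2)  # default PSOCK type

# Split the row indices into one chunk per worker
chunks <- lapply(splitIndices(length(rows), 2), function(i) rows[i])
partial <- parLapply(cl, chunks, count_tokens)
stopCluster(cl)

# Merge the per-chunk counts into one total per token
all_counts <- unlist(lapply(partial, function(t) setNames(as.vector(t), names(t))))
total <- tapply(all_counts, names(all_counts), sum)
print(total)
```

With a billion rows, the chunks would more likely be read from separate files or file offsets inside each worker, so that the full dataset is never held in the master process.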

Alejandro Alcalde
    The `parallel` package now belongs to the R base library. Provided that the installed `R` version is up-to-date, the most recent documentation is available with `vignette("parallel","parallel")`. – RHertel May 30 '16 at 07:24