I need to analyse a group of clients, say 2783 of them. I have R code written for a generic client, and all the data it needs to calculate the different variables is in a database linked to the workspace. The code for a single client must run sequentially, since many variables depend on ones computed earlier. Each client takes about 1 minute to run. My computer has 8 logical processors, and R only uses 1 unless the work is run in parallel.
What I haven't found an answer to on the internet yet is how to do the following from a batch file: send client 1 to the first processor, and so on up to client 8 on the eighth processor, and whenever a processor finishes its client, write a log file with some specifics about that run and move on to the next client on the list. For example, when processor 1 finishes client 1, it picks up client 9 (since the other 7 processors are already busy with clients 2 to 8).
A given processor must start and finish any client it has picked up, because of the dependencies I mentioned. And each week I have a similar number of clients to analyse.
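To make the scheduling I have in mind concrete, here is a rough sketch using the base `parallel` package, where `clusterApplyLB` hands the next client to whichever worker becomes free. This is only an illustration under assumptions: `run_client`, the log format, the log directory and the small client and worker counts are all hypothetical placeholders, not my real code.

```r
library(parallel)

# Hypothetical wrapper around the per-client analysis. The body is a
# placeholder; the real sequential code for one client would go here.
run_client <- function(client_id, log_dir) {
  started <- Sys.time()
  status <- tryCatch({
    Sys.sleep(0.05)  # stand-in for the ~1 minute of real work
    "ok"
  }, error = function(e) conditionMessage(e))
  elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))

  # One log file per client, so workers never write to the same file.
  log_file <- file.path(log_dir, sprintf("client_%04d.log", client_id))
  writeLines(
    sprintf("client=%d pid=%d seconds=%.2f status=%s",
            client_id, Sys.getpid(), elapsed, status),
    log_file
  )
  status
}

client_ids <- seq_len(20)  # assumption: 2783 IDs in the real run
n_workers  <- 2            # assumption: 8 on the machine described
log_dir    <- file.path(tempdir(), "client_logs")  # hypothetical location
dir.create(log_dir, showWarnings = FALSE)

cl <- makeCluster(n_workers)
# Load-balanced apply: each worker runs one whole client at a time and
# receives the next unprocessed client as soon as it finishes.
statuses <- clusterApplyLB(cl, client_ids, run_client, log_dir = log_dir)
stopCluster(cl)
```

Each client is started and finished by a single worker process, which matches the requirement above. A script like this could presumably be launched from a batch file with `Rscript`, although the database connection would need to be opened inside each worker rather than shared.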
So the problem is running R code as a batch job and in parallel, to make full use of the computer's processing power.
At the current rate, running the whole client extract of about 2800 people would take almost 2 days working around the clock! Using all 8 cores would cut that by around 88%, to approximately 6 hours, and since it would run as a batch job, those would be 6 hours in which I could focus on other work.
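The estimate above can be checked with a few lines of arithmetic. This assumes ideal scaling, i.e. that each of the 8 logical processors handles a client as fast as a single one does today, which may be optimistic if some of them are hyperthreads or if the database becomes a bottleneck.

```r
n_clients          <- 2783
minutes_per_client <- 1
n_workers          <- 8

# Serial run: every client one after another.
serial_hours <- n_clients * minutes_per_client / 60

# Parallel run: the busiest worker handles ceiling(2783 / 8) = 348 clients.
parallel_hours <- ceiling(n_clients / n_workers) * minutes_per_client / 60

reduction_pct <- 100 * (1 - parallel_hours / serial_hours)

round(serial_hours, 1)    # 46.4 hours, a little under 2 days
round(parallel_hours, 1)  # 5.8 hours
round(reduction_pct, 1)   # 87.5 percent
```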
Thanks in advance!