I have an embarrassingly parallel job that requires no communication between the workers. I'm trying to use the dfeval function, but the overhead seems to be enormous. To get started, I'm trying to run the example from the documentation.
>> matlabpool open
Starting matlabpool using the 'local' configuration ... connected to 8 labs.
>> sched = findResource('scheduler','type','local')
sched =
Local Scheduler Information
===========================
Type : local
ClusterOsType : pc
ClusterSize : 8
DataLocation : C:\Users\~\AppData\Roaming\MathWorks\MATLAB\local_scheduler_data\R2010a
HasSharedFilesystem : true
- Assigned Jobs
Number Pending : 0
Number Queued : 0
Number Running : 1
Number Finished : 8
- Local Specific Properties
ClusterMatlabRoot : C:\Program Files\MATLAB\R2010a
>> matlabpool close force local
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
>> sched = findResource('scheduler','type','local')
sched =
Local Scheduler Information
===========================
Type : local
ClusterOsType : pc
ClusterSize : 8
DataLocation : C:\Users\~\AppData\Roaming\MathWorks\MATLAB\local_scheduler_data\R2010a
HasSharedFilesystem : true
- Assigned Jobs
Number Pending : 0
Number Queued : 0
Number Running : 0
Number Finished : 8
- Local Specific Properties
ClusterMatlabRoot : C:\Program Files\MATLAB\R2010a
>> tic;y = dfeval(@rand,{1 2 3},'Configuration', 'local');toc
Elapsed time is 4.442944 seconds.
Running the same command again produces similar timings. So my questions are:
- Why do I need to run matlabpool close force local to get Number Running down to zero, given that I ran matlabpool open in a fresh MATLAB instance?
- Is roughly 4.4 seconds of overhead really necessary for such a trivial example, especially given that the MATLAB workers have already been started?
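To narrow down where the time goes, the dfeval call could presumably be split into its stages using the lower-level job functions from the same toolbox (createJob, createTask, submit, waitForState, getAllOutputArguments). The following is only a sketch under that assumption, not something taken from the documentation example; the variable names tCreate, tRun and tFetch are made up for illustration.

```matlab
% Sketch: time each stage of a local-scheduler job separately,
% doing the same work as dfeval(@rand, {1 2 3}, 'Configuration', 'local').
sched = findResource('scheduler', 'type', 'local');

% Stage 1: create the job and one task per input argument.
tic;
job = createJob(sched);
for n = 1:3
    createTask(job, @rand, 1, {n});   % each task returns rand(n)
end
tCreate = toc;

% Stage 2: submit the job and block until all tasks have finished.
tic;
submit(job);
waitForState(job, 'finished');
tRun = toc;

% Stage 3: collect the results (one row per task, one column per output).
tic;
y = getAllOutputArguments(job);
tFetch = toc;

% Remove the job from the scheduler's data location.
destroy(job);

fprintf('create: %.3f s, run: %.3f s, fetch: %.3f s\n', tCreate, tRun, tFetch);
```

If most of the time shows up in the submit/wait stage, that would suggest the cost is in the scheduler launching the tasks rather than in creating the job or fetching the results.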