I am using Matlab R2010b as a driver for a number of external tools. Matlab does some data transformations, writes intermediate results to disk, calls external scripts using `system`, and so on.

I have never had any problems with batch computations taking several days in Matlab, but apparently I am doing something wrong now. Every now and then, at irregular intervals, the pipeline completely jams. Nothing happens, no external scripts are being called, love's labour's lost. Pressing Ctrl+C in Matlab gets things moving again. Because of this I assume the problem is Matlab.

There are no pauses in the pipeline nor am I using any variant of sleep sort. The holdups occur at completely random places in the code and seem to start occurring after at least one hour of working perfectly. Since the Matlab code is basically a giant loop, it's a complete mystery to me (logic posits that if the body of a loop works once it should keep working).

It doesn't seem to have anything to do with power management (and like I said, I've done numerous long-time computations successfully in the past). This is why I assume it has something to do with the combination of Matlab + scripts.

Has anyone experienced something remotely similar (and, hopefully, been able to solve it)?

Marc Claesen
  • Well, reading your description, I wouldn't blame Matlab for this. It is more likely a problem with the Matlab process at the system level (it is waiting for some resource or something like that). Is there anything unusual in the system's event logs? Did you scan the system for viruses and test the memory and the disk? Are you able to run the whole setup on another machine? – hesar Jan 22 '14 at 21:34
  • @hesar I can't spot any unusual events. I've got plenty of free memory/HDD space and when the pipeline stalls CPU usage drops to idle levels on all cores (i.e. there are no competing processes). No viruses are found either. I assume memory and disk are fine since similar pipelines that use something other than Matlab (Python) as a driver work fine. – Marc Claesen Jan 22 '14 at 21:38
  • Is there a java leak somewhere? I.e. does the problem go away/take longer to appear when you increase Java Heap Memory? – Jonas Jan 22 '14 at 22:14
  • @Jonas good idea! I'll try that and see what happens. – Marc Claesen Jan 22 '14 at 22:17
  • What OS is this? Is the computer on a network? What else is running? If it happens again you should check Matlab's memory and CPU usage (presumably these shouldn't change much as your script runs). And of course "logic posits that if the body of a loop works once it should keep working" is far from true when it comes to real software on real hardware in the real world. – horchler Jan 22 '14 at 23:12
  • @horchler it's on Debian Wheezy, no network and no other processes. On jams matlab's CPU usage drops to zero and no scripts are running. – Marc Claesen Jan 23 '14 at 08:57
  • @MarcClaesen: Good, at least it's not Windows! Have you tried changing the process priority using [`nice`/`renice`](http://linux.101hacks.com/monitoring-performance/hack-100-nice-command-examples/)? Remember, low (negative) niceness values are probably what you want. And [this question](http://stackoverflow.com/questions/5718567/niceness-and-priority-processes-on-linux-system) might also be informative. Can you run your code with no GUI from the command line, i.e., with the `-nojvm` option? That might rule out some Java-related issues. – horchler Jan 23 '14 at 16:15
  • @horchler matlab always runs on `nice` level -3 for me, so it's not a priority issue. If it were, I would assume it would also present itself when running long matlab-only computations (which work perfectly fine). – Marc Claesen Jan 23 '14 at 16:18

1 Answer

You can keep Matlab busy with smaller jobs in between the larger ones. This may reveal something about your system. Start with lots of small jobs, then move to a smaller number of larger jobs, and see whether you can find some sort of bottleneck threshold ...
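A minimal sketch of one way to collect that kind of evidence, assuming the external tools are launched through `system` as described in the question: write a timestamped line before and after every call, so that when the pipeline jams the last `START` line in the log shows which command never returned and how long the earlier ones took. The command strings and the log path below are hypothetical placeholders, not the asker's actual scripts.

```matlab
% Hypothetical commands standing in for the real external scripts.
cmds = {'true', 'sleep 1', 'echo done'};

% Hypothetical log location; 'a' appends and, per the fopen documentation,
% flushes after each write, so the last line survives a hang.
logfile = fullfile(tempdir, 'pipeline_timing.log');
fid = fopen(logfile, 'a');

status  = zeros(1, numel(cmds));   % exit code of each command
elapsed = zeros(1, numel(cmds));   % wall-clock seconds per command

for k = 1:numel(cmds)
    % Log before the call: if this is the last line in the file,
    % the command on it is the one that never returned.
    fprintf(fid, '%s START %s\n', datestr(now, 31), cmds{k});

    t0 = tic;
    [status(k), out] = system(cmds{k});   % capture output instead of echoing it
    elapsed(k) = toc(t0);

    fprintf(fid, '%s END   %s status=%d elapsed=%.2fs\n', ...
            datestr(now, 31), cmds{k}, status(k), elapsed(k));
end

fclose(fid);
```

Varying the mix of short and long commands in `cmds` and comparing the logged durations is one way to look for the threshold mentioned above. This only narrows down where the stall happens; it does not by itself explain the cause.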

bhamadicharef
  • 360
  • 1
  • 11