
I have read through this SO question and its answers (R parallel computing and zombie processes), but it doesn't seem to quite address my situation.

I have a 4-core MacBook Pro running Mac OS X 10.10.3, R 3.2.0, and RStudio 0.99.441.

Yesterday, I was trying out the packages "foreach" and "doParallel" (I want to use them in a package I am working on). I did this:

library(doParallel)   # attaches foreach, iterators, and parallel

cl <- makeCluster(14)
registerDoParallel(cl)

a <- 0
ls <- foreach(i = icount(100)) %dopar% {
    b <- a + 1
}

It is clear to me that it doesn't make sense to have 14 processes on my 4-core machine, but the software will actually be run on a 16-core machine. At this point my computer ground to a halt. I opened Activity Monitor and found 16 (or more, maybe?) R processes. I tried to force quit them from Activity Monitor -- no luck. I closed RStudio and that killed all the R processes. I reopened RStudio and that restarted all the R processes. I restarted the computer and restarted RStudio and that restarted all the R processes.

How can I start RStudio without restarting all those processes?

EDIT: I forgot to mention that I also rebuilt the package I was working on at the time (all the processes may have been running during the build)

EDIT2: Also, I can't stopCluster(cl) because cl is not in the environment anymore -- I closed that R session.
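
For reference, a minimal sketch of the cleanup that is still possible without cl -- killing the leftover workers by PID from a fresh session. This assumes the workers show up as processes named plain "R", the way they do in Activity Monitor:

# find every process named exactly "R", drop the current session, and send SIGTERM
pids <- as.integer(system("pgrep -x R", intern = TRUE))
pids <- setdiff(pids, Sys.getpid())
tools::pskill(pids)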

EDIT3: When I open R.app (The R GUI provided with R) or open R in the terminal, no such problem occurs. So I think it must be RStudio-related.

EDIT4: There appears to be a random delay between opening RStudio and the starting of all these undesired processes. Between 15s and 2 mins.

EDIT5: It seems the processes only start after I open the project from which they were started.

EDIT6: I have been picking through the .Rproj.user files looking for things to delete. I deleted all the files (but not the directories) in ctx, pcs, and sdb. Problem persists.

EDIT7: When I run "killall R" at the command line it kills all these processes, but when I restart RStudio and reopen the project, all the processes start again.

EDIT8: I used "killall -s R | wc -l" to find that the number of R processes grows and grows while the project is open. It got up to 358 and then I ran "killall R" because my computer was making scary sounds.

EDIT9: RStudio is completely unusable currently. Every time I "killall R", it restarts all the processes within 15 seconds.

EDIT10: When I initiate a build, that also starts up tons of R processes -- 109 at last check. These processes all get started up when the build says "preparing package for lazy loading". At this point the computer grinds to a near-halt.

EDIT11: I deleted the .Rproj file (actually just moved it as a backup) and the .Rproj.user directory. I used "create project from directory" in RStudio. When I open that NEW project, I still get the same behavior. What is RStudio doing when I open a project that isn't contained anywhere in the .Rproj file or the .Rproj.user directory!? I've spent the whole day on this one problem....:(

rcorty
  • See here for resetting rstudio to a fresh state: https://support.rstudio.com/hc/en-us/articles/200534577-Resetting-RStudio-s-State – Ritchie Sacramento Jun 04 '15 at 17:10
  • Jay, thanks for your suggestion. I did what was described on that page and it definitely reset my RStudio settings (layout, appearance, etc.), but when I start RStudio, it still starts all the unwanted processes. – rcorty Jun 04 '15 at 17:12
  • And I'll just add -- I think your suggestion could be an answer (rather than a comment). Just because it's easy/short doesn't mean it couldn't have been the completely correct and most useful response! – rcorty Jun 04 '15 at 17:13
  • I have just run this on my macbook pro with `makeCluster(32)` and (i) my computer did not freeze (in fact the code finished pretty much instantaneously) and (ii) after restarting RStudio the R processes did not reappear. – Roland Jun 04 '15 at 19:20
  • @Roland, maybe it has something to do with building the package? – rcorty Jun 04 '15 at 19:25

3 Answers


Best guess -- the newest version of RStudio tries to do some work behind the scenes to develop an autocompletion database, based on library() and require() calls it detects within open files in your project. To do this, it launches new R processes, loads those packages (with library()), and then returns the set of all objects made available by those packages.

By any chance are you loading certain packages that have complex .onLoad() actions? It's possible that this engine in RStudio is running those in R processes behind the scenes, but getting stuck for some reason and leaving you with these (maybe stale or busy) R processes.
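
For example, purely hypothetical package code to illustrate the failure mode -- a load hook like this would start workers every time the package gets loaded by a background process, with nothing ever stopping them:

# R/zzz.R -- hypothetical illustration only
.onLoad <- function(libname, pkgname) {
  cl <- parallel::makeCluster(14)       # starts 14 new R worker processes
  doParallel::registerDoParallel(cl)    # registers them for %dopar%
  # no stopCluster(cl) anywhere, so the workers outlive whoever loaded the package
}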

For reference, a somewhat similar issue was reported here.

Kevin Ushey
  • Kevin -- thanks for the insight. I bet it is something along these lines. But the package I'm working on has lots of dependencies, so maybe the best solution is just to go back to a previous version of RStudio? I do love the code completion in the new RStudio, though. – rcorty Jun 05 '15 at 14:20
  • If you happen to narrow down the problem, please submit a bug report at http://support.rstudio.com. Thanks! – Kevin Ushey Jun 05 '15 at 18:18

Here's what ended up fixing it:

Delete the package I built (the binary, I believe -- I clicked the "x" to the right of its name in the "Packages" pane of RStudio).

Rebuild it, with

library(parallel)

commented out.
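
Roughly the same thing from the console, if that is easier to reproduce (the package name and path are placeholders):

remove.packages("mypackage")              # same effect as clicking the "x" in the Packages pane
# ...comment out or delete the library(parallel) line in the stray .R file...
devtools::install("~/path/to/mypackage")  # rebuild and reinstall from source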

rcorty
  • Why do you have a `library` call in your package? State dependencies in your DESCRIPTION file. – Roland Jun 05 '15 at 21:38
  • Totally fair point. As described in the original question, I was just playing around with the parallel package to see how it worked. I had a file called "temp.R" where I was doing some "hello world" type stuff. Live and learn. Maybe the trick is to call that file "temp.txt" or anything other than ".R" so that, should I want to rebuild while the file exists, it gets ignored. OR, just add that file to .Rbuildignore! – rcorty Jun 05 '15 at 22:25
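
A minimal sketch of what Roland describes -- declare the dependencies in DESCRIPTION and call them with ::, instead of putting library() calls in files under R/ (all names below are placeholders):

# In DESCRIPTION:   Imports: parallel, doParallel, foreach
# In NAMESPACE (roxygen tag: @importFrom foreach %dopar%):   importFrom(foreach, "%dopar%")

# R/run_parallel.R
run_parallel <- function(n_workers = 2) {
  cl <- parallel::makeCluster(n_workers)
  on.exit(parallel::stopCluster(cl), add = TRUE)   # workers always get shut down
  doParallel::registerDoParallel(cl)
  foreach::foreach(i = seq_len(10)) %dopar% { i + 1 }
}
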
unloadNamespace("doParallel")

will stop the unnamed workers started by registerDoParallel.

If you still have the cluster object, you can use:

stopCluster(cl)
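
Putting the two cases together -- a minimal sketch; stopImplicitCluster() is available in recent versions of doParallel and is a safe no-op when no implicit cluster was created:

library(doParallel)

# if you registered an explicit cluster and still have the object:
cl <- makeCluster(2)
registerDoParallel(cl)
res <- foreach(i = 1:4) %dopar% { i^2 }
stopCluster(cl)

# if the object is gone (or was never returned), either of these cleans up
# the implicit cluster, where one was created:
stopImplicitCluster()
unloadNamespace("doParallel")
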
cloudscomputes