Questions tagged [snowfall]

Usability wrapper around snow for easier development of parallel R programs.

66 questions
105
votes
1 answer

How to setup workers for parallel processing in R using snowfall and multiple Windows nodes?

I’ve successfully used snowfall to set up a cluster on a single server with 16 processors. require(snowfall) if (sfIsRunning() == TRUE) sfStop() number.of.cpus <- 15 sfInit(parallel = TRUE, cpus = number.of.cpus) stopifnot( sfCpus() ==…
jclouse
14
votes
1 answer

R connecting to EC2 instance for parallel processing

I am having trouble initialising a connection to an AWS EC2 instance from R, as I keep getting the error: Permission denied (publickey). I am currently using Mac OS X 10.6.8 as my OS. The code that I try to run in the terminal ($) and then R…
h.l.m
9
votes
1 answer

How to log using futile logger from within a parallel method in R?

I am using futile logger in R for logging. I have a parallel algorithm implemented using snowfall in R. Each core of the parallel process logs an intermediate output in the logger, but this output is not showing up in the logger. Can we log using…
user1971988
8
votes
1 answer

How to output a message in snowfall?

I am conducting a simulation study using the snowfall package on Windows 7. I would like to print a message to the main R console every 10 runs to monitor progress, but nothing is printed. Any help will be much…
Tony
8
votes
2 answers

Importing snowfall into custom R package

I'm developing an R package which needs to use parallelisation as made available by the snowfall package. snowfall doesn't seem to import the same way as other packages like ggplot2, data.table, etc. I've included snowfall, rlecuyer, and snow in the…
TheComeOnMan
7
votes
2 answers

Using snow (and snowfall) with AWS for parallel processing in R

In relation to my earlier similar SO question, I tried using snow/snowfall on AWS for parallel computing. What I did was: in the sfInit() function, I provided the public DNS to the socketHosts parameter like so: sfInit(parallel=TRUE,socketHosts…
7
votes
1 answer

Initializing MPI cluster with snowfall R

I've been trying to run Rmpi and snowfall on my university's clusters but for some reason no matter how many compute nodes I get allocated, my snowfall initialization keeps running on only one node. Here's how I'm initializing…
6
votes
3 answers

writing to global environment when running in parallel

I have a data.frame of cells, values and coordinates. It resides in the global environment. > head(cont.values) cell value x y 1 11117 NA -34 322 2 11118 NA -30 322 3 11119 NA -26 322 4 11120 NA -22 322 5 11121 NA -18 322 6…
Roman Luštrik
6
votes
2 answers

Fast correlation in R using C and parallelization

My project for today was to write a fast correlation routine in R using the basic skillset I have. I have to find the correlation between almost 400 variables each having almost a million observations (i.e. a matrix of size p=1MM rows & n=400…
user1971988
5
votes
1 answer

Why not load balance when parallel computing using snowfall?

For a long time I have been using sfLapply for a lot of my parallel R scripts. However, recently, as I have delved more into parallel computing, I have been using sfClusterApplyLB, which can save a lot of time if individual instances do not take the…
Lucas Fortini
5
votes
3 answers

How to calculate number of occurrences per minute for a large dataset

I have a dataset with 500k appointments lasting between 5 and 60 minutes. tdata <- structure(list(Start = structure(c(1325493000, 1325493600, 1325494200, 1325494800, 1325494800, 1325495400, 1325495400, 1325496000, 1325496000, 1325496600,…
TimV
5
votes
2 answers

Communication of parallel processes: what are my options?

I'm trying to dig a bit deeper into parallelization of R routines. What are my options with respect to the communication of a bunch of "worker" processes regarding the communication between the respective workers? the communication of the workers…
Rappster
4
votes
2 answers

How to manage parallel processing with animated ggplot2-plot?

I'm trying to build an animated barplot with ggplot2 and magick that grows on a "day per day" basis. Unfortunately, I've got tens of thousands of entries in my dataset (dates for each day for several years and different categories), which makes…
alex_555
4
votes
1 answer

Can't kill workers after running R script

I am using: R Version 3.0.1 (2013-05-16) and snowfall 1.84-4 initialized (using snow 0.3-13) on an m2.2xl AWS EC2 with the original AMI coming from http://www.louisaslett.com/RStudio_AMI/. My problem is that after creating a cluster using:…
James Tobin
4
votes
2 answers

R Snowfall - Difficulty in implementing functions that call other functions

I am trying to teach myself how to use the Snowfall package, and I have run into the following problem when I try a function that calls a second function (this is a simplified use case of what I ultimately want to implement). I currently…
billelev