
Can purrr simulate a random walk in fewer lines of code than base R?

For example, what would the following look like in purrr?

n <- 5
x <- numeric(n)
for(i in 2:n){
  x[i] <- x[i-1] + rnorm(1)
}
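
One way to write the loop above in purrr is with `accumulate()`, which carries a running value through a vector much as the `for` loop does. This is a minimal sketch and not necessarily the shortest form; the names `steps`, `walk_purrr`, and `walk_base` are made up for illustration, and the seed is arbitrary.

```r
library(purrr)

n <- 5
set.seed(42)               # arbitrary seed, only so the run is reproducible
steps <- rnorm(n - 1)      # draw all increments in a single call

# accumulate() keeps every intermediate sum, starting from .init = 0
walk_purrr <- accumulate(steps, `+`, .init = 0)

# Base R one-liner along the lines of lmo's comment
walk_base <- c(0, cumsum(steps))

# Compare the two walks (up to floating-point tolerance)
all.equal(walk_purrr, walk_base)
```

Both versions return a vector of length `n` that starts at 0, so the purrr form matches the loop but is no shorter than `cumsum()`.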

Edit: @lmo had exactly what I was looking for. To take this one step further, I am assuming a random walk can be created and plotted with the following code:

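library(tidyverse)  # provides tibble(), %>% and ggplot2

# Cumulative sum of standard normal steps, plotted as points joined by a line
tibble(x = 1:1000, y = cumsum(rnorm(1000, mean = 0))) %>%
  ggplot(aes(x = x, y = y)) +
  geom_point() +
  geom_line()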


Alex
  • 22
    In one vectorized line: `cumsum(rnorm(5))` (in base R) or `c(0, cumsum(rnorm(5)))` to start at 0. – lmo Aug 25 '17 at 14:41
  • 2
    @d.b `set.seed(42); replicate(5, rnorm(1)); set.seed(42); rnorm(5)` – Roland Aug 25 '17 at 15:03
  • 2
    You want https://codegolf.stackexchange.com/ for such navel-contemplation exercises. – Spacedman Aug 25 '17 at 15:26
  • 1
    @d.b In terms of speed, it is faster to run it all at once, because you go into the C code once and produce all of the values. In the second instance, the function must be called 5 separate times. On 1000 draws, `microbenchmark` returned median times (microseconds) of 69.5 for `rnorm(1000)`, versus 35629.9 for `lapply` with `rnorm(1)` and 1994.7 for a `for` loop with `rnorm(1)`, over 50 replications. I'm using Microsoft R Open 3.2.5. – lmo Aug 25 '17 at 15:32
  • @lmo Timings might look very different with more recent versions of R due to the JIT byte code compiler. – Roland Aug 25 '17 at 15:44
  • @Roland Agreed. That's the main reason I mentioned the version. Unfortunately, state of the art is not available to me at the moment. If I remember, I'll run this at home to see if there are any changes. Even here, the `for` loop crushed `lapply`. – lmo Aug 25 '17 at 15:48
  • 2
    @Roland Mistake on my part. My original `lapply` made calls to `rnorm` whose argument grew along with the iteration: `rnorm(1)`, `rnorm(2)`, ... and so on. When I replaced this with an anonymous function, `lapply(seq_len(1000), function(x) rnorm(1))`, it ran just as fast as, maybe a bit faster than, the `for` loop (2328.6 versus 2402.4 at the median measure). – lmo Aug 25 '17 at 16:13
  • @lmo It looks like `purrr` has something similar: `map(seq_len(1000), function(x) rnorm(1))`. I am not sure if `map` is any faster than `lapply`. – Alex Aug 25 '17 at 19:52
  • 5
    @Alex Why would it be? lapply is strongly optimized for speed. It's the repeated calls to an R function that take time, in particular if that R function isn't a primitive. I don't see how that could be improved, except by byte code compilation. – Roland Aug 26 '17 at 06:07
  • Does anyone know which libraries are used to make this "style" of graph? – stats_noob Mar 03 '22 at 23:14
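
The points Roland and lmo make in the comments can be checked with a short base R sketch. It assumes the default random number generator and an arbitrary seed, and the names `one_call` and `many_calls` are made up for illustration. Timings depend on the machine and the R version, so none are quoted here.

```r
n <- 1000

# Same seed, same stream: one call to rnorm(n) versus n calls to rnorm(1)
set.seed(42)
one_call <- rnorm(n)
set.seed(42)
many_calls <- vapply(seq_len(n), function(i) rnorm(1), numeric(1))
identical(one_call, many_calls)

# Rough timing comparison using only base R, 50 replications each
reps <- 50
system.time(for (r in seq_len(reps)) rnorm(n))
system.time(for (r in seq_len(reps)) lapply(seq_len(n), function(i) rnorm(1)))
system.time(for (r in seq_len(reps)) {
  x <- numeric(n)
  for (i in seq_len(n)) x[i] <- rnorm(1)
})
```

`system.time()` is coarser than `microbenchmark`, but it needs no extra package and is enough to see the gap between one vectorized call and many small ones.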

0 Answers