I have a list of data.frames and would like to construct a new data.frame from the list like so:

 u=runif(2, 0, 1)
 u.obs=list(data.frame(site='dl',  
                       swe.obs=runif(4, 0, 1),
                       model.type='r'),
            data.frame(site='nt', 
                       swe.obs=runif(5, 0, 1),
                       model.type='lm'),
            data.frame(site='nt',
                       swe.obs=runif(3,0,1),
                       model.type='lm'),
            data.frame(site='nt',
                       swe.obs=runif(3,0,1),
                       model.type='r'))

EDIT: @dickoa gave an answer that worked for my example but not for real so I am adding to u.obs to make it more real.

EDIT2: Just kidding. it looked different, but is the same from what I can tell.

summ.df=data.frame(model=u,
                   obs.min=laply(u.obs$swe.obs, min), 
                   obs.max=laply(u.obs$swe.obs, max), 
                   obs.mean=laply(u.obs$swe.obs, mean),
                   site=laply(u.obs$site, '[', 1),
                   model.type=laply(u.obs$model.type, '[', 1), 
                   date=laply(u.obs$date, '[', 1))

but I can't extract site and model.type, even though u.obs[[1]]$site[1] works fine. Can someone assist me? Thanks

Joshua Ulrich
Dominik

2 Answers

Number 1: use spaces and line breaks in your code. It will help with debugging, for you and for us!

Number 2: your question:

Look at the results of your first few laply calls:

laply(u.obs$swe.obs, min)
# logical(0)

This is because u.obs is an unnamed list of data.frames, so u.obs$swe.obs doesn't exist (it is NULL). Instead you want u.obs[[i]]$swe.obs for each element i. You can get there using an anonymous function, or the amazingly handy summarise.

laply(u.obs, summarise, min(swe.obs))

Once those calls return non-empty results, your data.frame() call will give the result you expected. However, the excellent thing about summarise and plyr is that you don't have to build the data.frame piece by piece like that. Instead, use ldply:

summ.df <- ldply(u.obs, 
                 summarise,
                 obs.min=min(swe.obs),
                 site=site[1])
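
The anonymous-function route mentioned above can also be sketched in base R, without plyr. This is only a minimal sketch: it assumes the four-element u.obs from the question (rebuilt here with an arbitrary seed so the block runs on its own), and summarise_one is a hypothetical helper name, not something from plyr.

```r
# Rebuild the question's list; the seed is an arbitrary choice for reproducibility.
set.seed(42)
u.obs <- list(
  data.frame(site = 'dl', swe.obs = runif(4, 0, 1), model.type = 'r'),
  data.frame(site = 'nt', swe.obs = runif(5, 0, 1), model.type = 'lm'),
  data.frame(site = 'nt', swe.obs = runif(3, 0, 1), model.type = 'lm'),
  data.frame(site = 'nt', swe.obs = runif(3, 0, 1), model.type = 'r')
)

# `$` on the unnamed list finds no element called swe.obs, so this is TRUE.
is.null(u.obs$swe.obs)

# Hypothetical helper: collapse one data.frame to a one-row summary.
summarise_one <- function(d) {
  data.frame(
    obs.min    = min(d$swe.obs),
    obs.max    = max(d$swe.obs),
    obs.mean   = mean(d$swe.obs),
    site       = as.character(d$site[1]),
    model.type = as.character(d$model.type[1]),
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every list element and stack the one-row results.
summ.df <- do.call(rbind, lapply(u.obs, summarise_one))
summ.df
```

The result has one row per element of u.obs, which is the same shape the ldply call produces.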
Justin
  • awesome, thanks. It's incredible what R and people who know how to use it can do. I swear I tried `laply(u.obs$swe,min)` itself but I must have changed my data structure or something. Thanks for your help. I had to follow it with `summ.df<-rbind(model=u,summ.df)` to get what I wanted. – Dominik Jul 25 '13 at 19:35
  • oh, and thanks for the heads up on summarise! didn't know of that one. – Dominik Jul 25 '13 at 19:37

If all of your data.frames have the same structure, it will be easier to change your approach and bind your data row-wise first.

Using your data

set.seed(1)
u <- runif(2, 0, 1)
u.obs <- list(
  data.frame(site = 'dl',
             swe.obs = runif(4, 0, 1),
             model.type = 'r'),
  data.frame(site = 'nt',
             swe.obs = runif(5, 0, 1),
             model.type = 'lm'))

We can do something like this:

require(plyr)
ddply(do.call(rbind, u.obs), .(site, model.type), summarise,
      obs.min = min(swe.obs), 
      obs.max = max(swe.obs), 
      obs.mean = mean(swe.obs))

##   site model.type  obs.min obs.max obs.mean
## 1   dl          r 0.201682 0.90821  0.64528
## 2   nt         lm 0.061786 0.94468  0.50047
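
One thing to keep in mind: grouping by site and model.type pools every list element that shares the same pair, so with the four-element list from the question the two nt/lm data.frames would be summarised as a single group. A minimal sketch, assuming you want one row per list element instead, is to tag each data.frame with its position before binding. The column name id, the object names and the seed are illustrative choices, not part of the original code.

```r
library(plyr)

# The question's four-element list; the seed is arbitrary.
set.seed(1)
u.obs <- list(
  data.frame(site = 'dl', swe.obs = runif(4, 0, 1), model.type = 'r'),
  data.frame(site = 'nt', swe.obs = runif(5, 0, 1), model.type = 'lm'),
  data.frame(site = 'nt', swe.obs = runif(3, 0, 1), model.type = 'lm'),
  data.frame(site = 'nt', swe.obs = runif(3, 0, 1), model.type = 'r')
)

# Tag each data.frame with its position in the list, then bind row-wise.
tagged  <- Map(function(d, i) cbind(id = i, d), u.obs, seq_along(u.obs))
all.obs <- do.call(rbind, tagged)

# Grouping by id as well keeps the two nt/lm elements separate.
per.element <- ddply(all.obs, .(id, site, model.type), summarise,
                     obs.min  = min(swe.obs),
                     obs.max  = max(swe.obs),
                     obs.mean = mean(swe.obs))
per.element
```

Dropping id from the grouping variables gives the pooled version, which may well be what you want if repeated site/model.type pairs are meant to be combined.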
dickoa
  • fyi, from reading the link from @user2510479 it seems `rbind.fill()` might be a better choice than `do.call(rbind,list)` – Dominik Jul 25 '13 at 20:22
  • This actually makes all my code shorter. I did a bunch of processing leading up to my question that can be eliminated if i just use `ddply(original.df,.(site,model.type),summarise,...)`. when will i learn. – Dominik Jul 25 '13 at 20:36
  • @Dominik Glad to help. While you're learning, take a look at `data.table`. The syntax will be a little different at first, but it is amazingly fast once you've grokked it. – Justin Jul 25 '13 at 22:27