I would like to create a new data.frame for each unique name in the "names" column (e.g., data from all the "A" rows would be in a new data.frame called "A"). Sample code:

names <- rep(letters[1:5], 3)
numbers <- runif(15)
results <- runif(15)
# data.frame() on the vectors directly keeps numbers and results numeric;
# wrapping them in cbind() first would coerce every column to character
data <- data.frame(names, numbers, results)
head(data)
  names   numbers   results
1     a 0.9316426 0.9119507
2     b 0.5122196 0.9104187
3     c 0.5538551 0.3518602
4     d 0.1882717 0.9301805
5     e 0.7517240 0.2906435
W148SMH
  • `split(data, f = data$names)`. This will put it in a list (which is generally best). – Gregor Thomas Jul 18 '16 at 21:56
  • If you *really* need data frames in your environment rather than a list of data frames, see `?list2env`. But you're almost certainly better off sticking with the list. – Gregor Thomas Jul 18 '16 at 22:00
  • Thanks Gregor. Split worked well. Can I apply functions to those individual parts now that they are split? – W148SMH Jul 18 '16 at 22:16
  • Sure, that's what `lapply` is for (or `sapply`, or the `purrr` package, or just a for loop...). – Gregor Thomas Jul 18 '16 at 22:17
  • I'm not used to working with lists. I tried to apply `mean` to the "numbers" column of the split result "f", but `lapply(f$numbers, mean)` doesn't work. – W148SMH Jul 18 '16 at 22:40
  • So, let's assume you assigned your list to something, say `datlist = split(data, f = data$names)`. Then to get those means, use `lapply(datlist, function(x) mean(x$numbers))`. `lapply` will always return a **l**ist, while `sapply` will attempt to **s**implify the resulting structure; compare with `sapply(datlist, function(x) mean(x$numbers))`. – Gregor Thomas Jul 18 '16 at 22:45
  • Though if this is the type of thing you're doing, `dplyr`, `data.table`, or even `base::aggregate` will be nicer. They let you work on groups within a data frame - the splitting is unnecessary. E.g., `aggregate(numbers ~ names, data = data, FUN = mean)` or `library(dplyr); group_by(data, names) %>% summarize(number_mean = mean(numbers))`. – Gregor Thomas Jul 18 '16 at 22:45
  • These responses were super helpful. Thank you. – W148SMH Jul 18 '16 at 22:57
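
The suggestions in the comments above can be pulled together into one short sketch. It is an illustration under a few assumptions rather than a definitive answer: `set.seed(42)` is added only for reproducibility, the columns are built with `data.frame()` directly so that `numbers` is numeric, and the names `means_list`, `means_vec`, `agg`, and `env` are made up for this example.

```r
# Sample data; set.seed() is an assumption added here for reproducibility.
set.seed(42)
data <- data.frame(
  names   = rep(letters[1:5], 3),
  numbers = runif(15),
  results = runif(15)
)

# One data frame per unique name, collected in a named list.
datlist <- split(data, f = data$names)
names(datlist)   # "a" "b" "c" "d" "e"
datlist$a        # the three rows where names == "a"

# lapply() returns a list; sapply() simplifies to a named numeric vector here.
means_list <- lapply(datlist, function(x) mean(x$numbers))
means_vec  <- sapply(datlist, function(x) mean(x$numbers))

# The same group means without splitting at all.
agg <- aggregate(numbers ~ names, data = data, FUN = mean)

# Only if separate objects are really needed: copy the list elements into an
# environment (a fresh one here, rather than the global environment).
env <- list2env(datlist, envir = new.env())
ls(env)          # "a" "b" "c" "d" "e"
```

Keeping the pieces in `datlist` means every later step is a single `lapply` or `sapply` call, which is why the list is usually preferable to separate objects. Passing `envir = .GlobalEnv` to `list2env` instead would create `a` through `e` as top-level data frames.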
