
Suppose I have a data frame with 7 variables. I want to subset it automatically based on the contents of one column. The column is "Dept" (department) and it has 17 different values. I would like R to look at the column "Dept" and create a new data frame for each department containing all of that department's rows. This would be the equivalent of "Split Worksheet" in Minitab. For now, I have to run the subset command 17 times to create a data frame for each. Can R do this automatically based on the column content?

Best and thanks!

Bob Wainscott
  • Got a sample? http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Thell Aug 15 '12 at 20:14
  • Yes, that will do the trick, but it prints to the console instead of placing the data for each department in its own data frame (preferably named by department). – Bob Wainscott Aug 15 '12 at 20:20
  • It makes a list of `data.frames`, which you can assign just like any other value. R almost never changes data out from under you; instead it works on the data and returns the result. By default the result is printed to the console, but you can use `<-` or `=` to assign it to a new variable. – Justin Aug 15 '12 at 20:22
  • @BobWainscott: it's usually better to leave them in a list, especially if you're going to perform similar analyses on each data.frame. Otherwise you will likely find yourself right back in a situation where you need to run another command 17 times... – Joshua Ulrich Aug 15 '12 at 20:24
  • Thank you a ton, the data is now split. I am a bit sketchy on analyzing it without seeing it in front of me as a data frame. I am using RStudio. Great website! Cheers. Actually, I can pull each one into its own data frame with `dep1 = out$dep1`. – Bob Wainscott Aug 15 '12 at 20:34

1 Answer

out<-split(df,df$Dept)

out[[1]]

# etc to access dataframes

or

out$Dept1

To give a concrete example:

df<-data.frame(Dept=c('a','a','b','b','c','d','d'),acs=c(111,112,222,223,333,444,445))
out<-split(df,df$Dept)
> out
$a
  Dept acs
1    a 111
2    a 112

$b
  Dept acs
3    b 222
4    b 223

$c
  Dept acs
5    c 333

$d
  Dept acs
6    d 444
7    d 445

dept.names<-names(out)

> dept.names[1]
[1] "a"

> out[[dept.names[1]]] # dataframe for department 1
  Dept acs
1    a 111
2    a 112

> out[[dept.names[2]]] # dataframe for department 2
  Dept acs
3    b 222
4    b 223


> is.data.frame(out[[dept.names[2]]])
[1] TRUE
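
Following up on the comments about keeping the pieces in a list versus having one data frame per department, here is a minimal sketch using the same toy data. The names `dept.means`, `dept.rows` and `dept.env` are made up for illustration, and the mean of `acs` is only a stand-in for whatever analysis you actually want to run.

```r
# Same toy data as in the example above
df <- data.frame(Dept = c('a', 'a', 'b', 'b', 'c', 'd', 'd'),
                 acs  = c(111, 112, 222, 223, 333, 444, 445))
out <- split(df, df$Dept)

# Run the same analysis on every department without leaving the list.
# sapply() returns a vector named by department.
dept.means <- sapply(out, function(d) mean(d$acs))
dept.rows  <- sapply(out, nrow)

# If you really want one variable per department, list2env() copies each
# list element into an environment under its list name.
# 'dept.env' is a hypothetical scratch environment used here so the
# workspace is not cluttered.
dept.env <- new.env()
list2env(out, envir = dept.env)
ls(dept.env)   # names of the data frames that were created
```

Passing `envir = globalenv()` to `list2env()` instead should create the variables directly in your workspace, which is probably what you want if you like to see them listed in RStudio. The list version tends to scale better, though, since anything you do once with `sapply()` or `lapply()` applies to all 17 departments.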
shhhhimhuntingrabbits