
I want to create individual data sets partitioned by a categorical variable within a data frame.

I want to take something like this and apply a function that will give me three different data sets:

head(df,10)

> color    value
> red      1
> red      2
> red      3
> blue     1
> blue     2
> blue     3
> green    1
> green    2
> green    3
> green    4

I imagine I would use the assign() function in some way, like:

assign(paste0("color-",df$color), df$value)

Ideally I would like to have

color-red

> value
> 1
> 2
> 3

color-blue

> value
> 1
> 2
> 3

etc...
Reid Williams
  • Have you looked at `split()`? It sounds like that could be what you're after (although it stores in a list and not as separate objects). – aosmith Oct 02 '19 at 21:38
  • I'll look into it! Storing into a list is fine because I can iterate through that pretty nicely! – Reid Williams Oct 02 '19 at 21:41

1 Answer


You don't want to do this with assign. The idiomatic way to do this in R is to make a list of data frames:

df <- data.frame(color = rep(c("red","blue","green"),each = 3),
                 value = rep(1:3,times = 3),
                 stringsAsFactors = FALSE)

df_split <- split(x = df,f = df$color)

> df_split[["blue"]]
  color value
4  blue     1
5  blue     2
6  blue     3

Each data frame can be referenced by name via df_split[["green"]], etc. or by position, df_split[[1]]. Keeping all the data in a single structure is more convenient and will avoid problems later on when you inevitably want to perform actions on all, or groups, of these data frames.
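As a rough sketch of what acting on all of the pieces might look like, the snippet below rebuilds the same example data and then works through the list. The names group_sums and values_by_color are only illustrative, and splitting just the value column is an assumption about what the question was after (pieces without the color column).

```r
# Same example data as above
df <- data.frame(color = rep(c("red", "blue", "green"), each = 3),
                 value = rep(1:3, times = 3),
                 stringsAsFactors = FALSE)

df_split <- split(x = df, f = df$color)

# Apply one function to every group; sapply() returns a vector named by group
group_sums <- sapply(df_split, function(d) sum(d$value))

# If only the values are needed, split the value column directly
# (illustrative; gives a named list of plain vectors instead of data frames)
values_by_color <- split(df$value, df$color)

# Iterate over the groups by name
for (nm in names(df_split)) {
  cat(nm, ":", nrow(df_split[[nm]]), "rows\n")
}
```

Note that split() orders the list elements by the sorted group labels, so a positional index like df_split[[1]] refers to "blue" here, not "red". Indexing by name is usually the safer choice.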

joran