
I have the following dataframe:

df <- data.frame(term = c("hello", "affirms", "allows", "hello", "always", "allows", "allows", "affirms"),
                 class = c("class 1", "class 1", "class 2", "class 2", "class 2", "class 3", "class 4", "class 4"),
                 stringsAsFactors = FALSE)
df
     term   class
1   hello class 1
2 affirms class 1
3  allows class 2
4   hello class 2
5  always class 2
6  allows class 3
7  allows class 4
8 affirms class 4

I would like to obtain a list like this:

combinations <- list(
  hello   = c("class 1", "class 2"),
  affirms = c("class 1", "class 4"),
  allows  = c("class 2", "class 3", "class 4"),
  always  = c("class 2")
)

The solution presented in "Split data.frame based on levels of a factor into new data.frames" does not suit my problem. If I apply the accepted answer from that question:

X <- split(df, df$class)
Y <- lapply(seq_along(X), function(x) as.data.frame(X[[x]])[, 1]) 
names(Y) <- c("class 1", "class 2", "class 3", "class 4")
list2env(Y, envir = .GlobalEnv)
`class 1`

I obtain

"hello"   "affirms"

which is not the desired result, because it groups terms by class rather than classes by term. I then modified the code as follows:

X <- split(df, df$term)
Y <- lapply(seq_along(X), function(x) as.data.frame(X[[x]])[, 2]) 

The result is close to the desired one, but the list elements are unnamed:

Y
[[1]]
[1] "class 1" "class 4"

[[2]]
[1] "class 2" "class 3" "class 4"

[[3]]
[1] "class 2"

[[4]]
[1] "class 1" "class 2"

How do I know that, for example, Y[[4]] corresponds to hello, while Y[[2]] corresponds to allows?

Mark

2 Answers


Use split directly on the class column, grouped by term. The result is a list named by term:

split(df$class, df$term)

#$affirms
#[1] "class 1"   "class 4"

#$allows
#[1] "class 2" "class 3" "class 4"

#$always
#[1] "class 2"

#$hello
#[1] "class 1" "class 2"
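
Note that split orders the groups alphabetically, whereas the desired list in the question follows the order in which the terms first appear. A minimal sketch of one way to keep that order, assuming it matters, is to split on a factor whose levels are given explicitly. The names term_order and combinations are only illustrative. The last two lines show that the list from the question keeps its names if lapply iterates over X itself rather than over seq_along(X).

```r
df <- data.frame(
  term  = c("hello", "affirms", "allows", "hello", "always", "allows", "allows", "affirms"),
  class = c("class 1", "class 1", "class 2", "class 2", "class 2", "class 3", "class 4", "class 4"),
  stringsAsFactors = FALSE
)

# A factor with explicit levels keeps the terms in order of first appearance;
# without it, split() orders the groups by the sorted factor levels.
term_order <- factor(df$term, levels = unique(df$term))
combinations <- split(df$class, term_order)

names(combinations)      # term names, in order of first appearance
combinations[["hello"]]  # look up the classes of a term by name

# Iterating over X directly (not seq_along(X)) carries the names through.
X <- split(df, df$term)
Y <- lapply(X, function(d) d$class)
```

Looking elements up by name, as in combinations[["hello"]], avoids relying on positions such as Y[[4]].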
Ronak Shah

We can use group_split from dplyr, which returns a list of data frames, one per term:

library(dplyr)
df %>%
    group_split(term)
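
One caveat: group_split does not name the elements of the list it returns, and each element is a whole data frame rather than a vector of classes. The following is a rough sketch, not necessarily the intended usage, of how the named list from the question could be built on top of it. It assumes group_keys returns the group labels in the same order as group_split, and the names grouped, pieces, keys and combinations are only illustrative.

```r
library(dplyr)

df <- data.frame(
  term  = c("hello", "affirms", "allows", "hello", "always", "allows", "allows", "affirms"),
  class = c("class 1", "class 1", "class 2", "class 2", "class 2", "class 3", "class 4", "class 4"),
  stringsAsFactors = FALSE
)

grouped <- df %>% group_by(term)

pieces <- group_split(grouped)    # unnamed list of tibbles, one per term
keys   <- group_keys(grouped)$term  # group labels, assumed to be in the same order

# Pull the class column out of each piece and attach the term names.
combinations <- setNames(lapply(pieces, function(d) d$class), keys)
```

For this particular task the base split call needs no extra package and returns the names directly, so this variant is mainly useful when the data is already being processed in a dplyr pipeline.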
akrun