
I have a tidy data frame, with one term and topic per row. It looks like this:

num_topic, term
1, blue
1, green
2, dog
2, cat

I would like to arrange each topic in a separate column, for human readability:

topic1, topic2
blue, dog
green, cat

This seems pretty intuitive, but I cannot figure out how to do it. It is not the same as the linked question, because there are no unique identifiers for each term. There are just lists of terms for each topic.

Adam_G
  • This is called a pivot – Keith Nov 17 '17 at 03:00
  • Thank you, that's very close! But the problem is, I just want columns of topics. There is really no unique identifier for each term in the topic. – Adam_G Nov 17 '17 at 03:04
  • @Adam_G Can you add a temporary identifier then? e.g. something like `df %>% group_by(num_topic) %>% mutate(id = seq(1, n())) %>% ungroup() %>% spread(num_topic, term) %>% select(-id)` – Z.Lin Nov 17 '17 at 03:15
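The temporary-identifier idea from the last comment can be sketched as follows. This is only an illustration under some assumptions: it uses `pivot_wider()` from tidyr in place of `spread()`, the `names_prefix = "topic"` argument is added here to produce the `topic1`, `topic2` headers from the question, and the names `id` and `wide` are placeholders.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(
  num_topic = c(1, 1, 2, 2),
  term = c("blue", "green", "dog", "cat"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(num_topic) %>%
  mutate(id = row_number()) %>%   # temporary position of each term within its topic
  ungroup() %>%
  pivot_wider(
    names_from = num_topic,       # one column per topic
    values_from = term,
    names_prefix = "topic"        # assumption: header style taken from the question
  ) %>%
  select(-id)                     # drop the temporary identifier

print(wide)
```

The temporary `id` only records the position of a term within its topic, so rows are paired by position and carry no other meaning.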

1 Answer


How about the following? Note that this assumes the entries within each num_topic group are "in order", so that rows are paired by position: blue lines up with dog, green with cat, and so on.

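# Recreate the example data from the question
df <- read.csv(text = 
    "num_topic,term
    1,blue
    1,green
    2,dog
    2,cat", stringsAsFactors = FALSE)  # keep term as character, not factor

# Split the rows by topic, then collect each topic's terms as one column
df <- as.data.frame(sapply(split(df, df$num_topic), function(x) x$term))
df
#       1    2
#1   blue  dog
#2  green  cat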

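Explanation: split() divides the data frame by num_topic, sapply() collects the term vector of each group as a column of a matrix, and as.data.frame() converts that matrix back to a data frame.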
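The approach above relies on every topic having the same number of terms, because sapply() only simplifies to a matrix when the pieces have equal length. A minimal base R sketch for topics of unequal size might look like the following. The extra term "red" is hypothetical data added only to create unequal groups, and the names `terms_by_topic`, `max_len`, `padded` and `wide` are placeholders.

```r
# Example data with unequal group sizes ("red" is hypothetical, not from the question)
df <- data.frame(
  num_topic = c(1, 1, 1, 2, 2),
  term = c("blue", "green", "red", "dog", "cat"),
  stringsAsFactors = FALSE
)

# One character vector of terms per topic
terms_by_topic <- split(df$term, df$num_topic)

# Pad the shorter vectors with NA so that all columns have the same length
max_len <- max(lengths(terms_by_topic))
padded <- lapply(terms_by_topic, function(x) {
  c(x, rep(NA_character_, max_len - length(x)))
})

# Bind the padded vectors as columns and label them topic1, topic2, ...
wide <- as.data.frame(padded, stringsAsFactors = FALSE)
names(wide) <- paste0("topic", names(terms_by_topic))

print(wide)
```

Padding with NA keeps the result rectangular; the NA cells simply mark topics that have fewer terms than the longest one.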

Maurits Evers