Suppose I have a data frame like this:

structure(list(id = c(1, 1, 1, 2, 2, 2, 3, 3), text = c("Google", 
"Google", "Amazon", "Amazon", "Google", "Yahoo", "Yahoo", "Google"
)), .Names = c("id", "text"), row.names = c(NA, -8L), class = "data.frame")

How can I produce a new data frame that contains, for each id, the count of every string?

Desired output:

  id Google Yahoo Amazon
1  1      2     0      1
2  2      1     1      1
3  3      1     1      0
Nathalie

2 Answers

If you need the output as a data frame with the columns in the exact order Google, Yahoo, Amazon, you can tabulate with table() and then reshape the result with dcast() from reshape2:

my_df <- table(dframe)
# text
# id  Amazon Google Yahoo
# 1      1      2     0
# 2      1      1     1
# 3      0      1     1
class(my_df)
# "table"
# -------------------------------------------------------------------------

library(reshape2)
table_df <- dcast(
  as.data.frame(my_df),
  id ~ factor(text, levels = c("Google", "Yahoo", "Amazon")),
  value.var = "Freq"
)
# table_df
# id Google Yahoo Amazon
# 1  1      2     0      1
# 2  2      1     1      1
# 3  3      1     1      0

class(table_df)
#[1] "data.frame"

Data

dput(dframe)
structure(list(id = c(1, 1, 1, 2, 2, 2, 3, 3), text = c("Google", 
"Google", "Amazon", "Amazon", "Google", "Yahoo", "Yahoo", "Google"
)), .Names = c("id", "text"), row.names = c(NA, -8L), class = "data.frame")
deepseefan

To expand a bit on Cole's comment:

 table(dframe)
   text
id  Amazon Google Yahoo
  1      1      2     0
  2      1      1     1
  3      0      1     1

table() indeed does the job. Calling it on the whole data frame is the same as passing the two columns as separate arguments:

table(dframe[,1],dframe[,-1])
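
Since the question asks for a data frame rather than a "table" object, here is a minimal base R sketch of one way to convert the result. It assumes the same dframe as in the question; the names tab and counts are only illustrative, and as.data.frame.matrix() is used because it keeps the wide layout instead of melting the table into long format.

```r
# Same data as in the question
dframe <- data.frame(
  id = c(1, 1, 1, 2, 2, 2, 3, 3),
  text = c("Google", "Google", "Amazon", "Amazon",
           "Google", "Yahoo", "Yahoo", "Google"),
  stringsAsFactors = FALSE
)

# Cross-tabulate id (rows) against text (columns)
tab <- table(dframe$id, dframe$text)

# Keep the wide layout: one column per string, one row per id
counts <- as.data.frame.matrix(tab)

# Put the columns in the requested order and restore id as a regular column
counts <- counts[, c("Google", "Yahoo", "Amazon")]
counts <- cbind(id = as.numeric(rownames(counts)), counts)
rownames(counts) <- NULL

counts
```

The column order here is hard-coded for this example; with other data you would pass whatever order you need, or skip that step to keep the alphabetical order that table() produces.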

You can do the same with tidyr and dplyr:

library(dplyr)
library(tidyr)

dframe %>%
  group_by(id,text) %>%
  summarise(n = n()) %>%
  spread(.,text,n,fill = 0)

# A tibble: 3 x 4
# Groups:   id [3]
     id Amazon Google Yahoo
  <dbl>  <dbl>  <dbl> <dbl>
1    1.     1.     2.    0.
2    2.     1.     1.    1.
3    3.     0.     1.    1.
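
In recent tidyr releases spread() is superseded by pivot_wider(). The following is a sketch of an equivalent pipeline, assuming tidyr 1.1.0 or later (where values_fill accepts a single value) and the same dframe as above; wide is just an illustrative name. count(id, text) is shorthand for the group_by() plus summarise(n = n()) step.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
dframe <- data.frame(
  id = c(1, 1, 1, 2, 2, 2, 3, 3),
  text = c("Google", "Google", "Amazon", "Amazon",
           "Google", "Yahoo", "Yahoo", "Google"),
  stringsAsFactors = FALSE
)

wide <- dframe %>%
  count(id, text) %>%                      # one row per (id, text) pair, count in column n
  pivot_wider(names_from = text,           # one new column per distinct string
              values_from = n,             # filled with the counts
              values_fill = 0)             # missing combinations become 0

wide
```

Unlike the spread() version, the result is not grouped, because count() drops the grouping it creates.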

Or with data.table:

library(data.table)

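# Count rows per (id, text), then cast to wide; value.var is named explicitly
# so dcast() does not have to guess which column holds the values
dcast(as.data.table(dframe)[, .N, by = .(id, text)], id ~ text, value.var = "N", fill = 0)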

   id Amazon Google Yahoo
1:  1      1      2     0
2:  2      1      1     1
3:  3      0      1     1
denis