I have data as follows:

User_Id  Website   Day
A        Google    Monday
A        Facebook  Tuesday
A        Linkedin  Wednesday
B        Facebook  Tuesday
B        Linkedin  Wednesday

I want to achieve something like this:

User_Id  Google  Facebook  Linkedin  Monday  Tuesday  Wednesday
A        1       1         1         1       1        1
B        0       1         1         0       1        1

Each column now holds the number of times that value appears for each user. How can I do this in R?

zx8754

2 Answers

We unlist the 2nd and 3rd columns of the data.frame (unlist(df1[-1])) and replicate the 1st column by the number of other columns, i.e. 2 in this case (rep(df1[,1], 2)). We then get the frequency count with table and convert it to a data.frame (as.data.frame.matrix).

as.data.frame.matrix(table(rep(df1[,1],2), unlist(df1[-1])))
#  Facebook Google Linkedin Monday Tuesday Wednesday
#A        1      1        1      1       1         1
#B        1      0        1      0       1         1
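
The one-liner above can be unpacked into its intermediate steps. This is only a sketch: df1 is not defined in the post, so the data.frame below is an assumed reconstruction of the question's data, and the names values, ids and counts are hypothetical helpers introduced for illustration.

```r
# Assumed reconstruction of the question's data (df1 is not defined in the post).
df1 <- data.frame(
  User_Id = c("A", "A", "A", "B", "B"),
  Website = c("Google", "Facebook", "Linkedin", "Facebook", "Linkedin"),
  Day     = c("Monday", "Tuesday", "Wednesday", "Tuesday", "Wednesday"),
  stringsAsFactors = FALSE
)

# Stack the Website and Day columns into a single character vector.
values <- unlist(df1[-1], use.names = FALSE)

# Repeat the user ids once per stacked column so they line up with `values`.
ids <- rep(df1$User_Id, times = ncol(df1) - 1)

# Cross-tabulate ids against values, then convert the table to a data.frame.
counts <- as.data.frame.matrix(table(ids, values))
counts
```

Using ncol(df1) - 1 instead of a hard-coded 2 keeps the ids aligned if more value columns are added later.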

If we need a package solution, another option is dplyr/tidyr. Reshape the 'wide' to 'long' format with gather (from tidyr), get the frequency count and spread back to 'wide' format.

library(dplyr)
library(tidyr)
df1 %>%
    gather(Var, Val, -User_Id) %>%
    count(User_Id, Val) %>%
    spread(Val, n, fill = 0)
#   User_Id Facebook Google Linkedin Monday Tuesday Wednesday
#    <chr>    <dbl>  <dbl>    <dbl>  <dbl>   <dbl>     <dbl>
#1       A        1      1        1      1       1         1
#2       B        1      0        1      0       1         1
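
In tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider. A rough equivalent of the pipeline above might look like the sketch below. It assumes a reasonably recent tidyr, and df1 is an assumed reconstruction of the question's data, since the post does not define it; the name wide is hypothetical.

```r
library(dplyr)
library(tidyr)

# Assumed reconstruction of the question's data (df1 is not defined in the post).
df1 <- data.frame(
  User_Id = c("A", "A", "A", "B", "B"),
  Website = c("Google", "Facebook", "Linkedin", "Facebook", "Linkedin"),
  Day     = c("Monday", "Tuesday", "Wednesday", "Tuesday", "Wednesday"),
  stringsAsFactors = FALSE
)

wide <- df1 %>%
  # Wide to long: one row per (User_Id, value) pair.
  pivot_longer(-User_Id, names_to = "Var", values_to = "Val") %>%
  # Frequency of each value per user; the count lands in column n.
  count(User_Id, Val) %>%
  # Long back to wide: one column per value, 0 where a user has no match.
  pivot_wider(names_from = Val, values_from = n, values_fill = list(n = 0L))

wide
```

The named-list form of values_fill is used here because it is accepted by both older and newer tidyr releases.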
akrun

An option with reshape2::recast, which first converts all the columns to long format by User_Id and then spreads them back to wide format, again by User_Id:

library(reshape2)
recast(df, User_Id ~ value, id.var = "User_Id", length)
#   User_Id Facebook Google Linkedin Monday Tuesday Wednesday
# 1       A        1      1        1      1       1         1
# 2       B        1      0        1      0       1         1
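
For readers who prefer to see the two stages separately, recast is roughly a melt followed by a dcast. The sketch below spells that out. The data.frame df is an assumed reconstruction of the question's data, and long and wide are hypothetical names used only for illustration.

```r
library(reshape2)

# Assumed reconstruction of the question's data (df is not defined in the post).
df <- data.frame(
  User_Id = c("A", "A", "A", "B", "B"),
  Website = c("Google", "Facebook", "Linkedin", "Facebook", "Linkedin"),
  Day     = c("Monday", "Tuesday", "Wednesday", "Tuesday", "Wednesday"),
  stringsAsFactors = FALSE
)

# Step 1: melt to long format; the result has columns User_Id, variable, value.
long <- melt(df, id.vars = "User_Id")

# Step 2: cast back to wide, one column per distinct value,
# counting occurrences per user with length().
wide <- dcast(long, User_Id ~ value, fun.aggregate = length, value.var = "value")

wide
```

Passing value.var explicitly avoids the message dcast prints when it has to guess the value column.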
David Arenburg
  • Thanks a lot, David. This worked too. Can I ask how you knew right away that this could be done with this particular package, and how you came up with the code? – Rohit Kadam Aug 02 '16 at 12:20
  • Usually by experience. This type of question has been asked many times on SO. When you are trying to reshape your data, look into the `reshape2`/`tidyr`/`data.table` packages and Google "reshape R", "long to wide r" and similar. Or just look into akrun's answers :) – David Arenburg Aug 02 '16 at 12:24
  • Hey David, whenever I use this, a column named "Var.2" creeps in next to User_Id. Any reason for this? – Rohit Kadam Aug 02 '16 at 12:57
  • I can't reproduce this. – David Arenburg Aug 02 '16 at 12:58