Transforming row-wise data into column-wise data in R

Question

I have a file in which clickstreams are stored in CSV format. The data looks like this:

Row 1. User1 - Click1

Row 2. User1 - Click2

Row 3. User1 - Click3

Row 4. User2 - Click1

Row 5. User3 - Click1

Row 6. User3 - Click2

and so on

Is there a function in R to give the data the following form?

Row 1. User1 - Click1 - Click2 - Click3

Row 2. User2 - Click1

Row 3. User3 - Click1 - Click2

Thanks

A description of what your data looks like is not very useful. We need to know the exact data structure. Please read [this FAQ](http://stackoverflow.com/a/5963610/1412059). You should also show some of your own effort to solve this. – Roland, Jul 29 '15 at 11:38

score 1 · Answer 1 · ulfelder · 2015-07-29T12:19:59.337

library(reshape2)
df <- data.frame(user = rep(LETTERS[1:3], each = 3), click = rep(1:3, times = 3))
dfmelt <- melt(df, id = "user")
dfcast <- dcast(dfmelt, user ~ variable + value)

Here's the toy data:

> df
  user click
1    A     1
2    A     2
3    A     3
4    B     1
5    B     2
6    B     3
7    C     1
8    C     2
9    C     3

Here's the result:

> dfcast
  user click_1 click_2 click_3
1    A       1       2       3
2    B       1       2       3
3    C       1       2       3

You can also do this in one line, but you won't get the nice column names:

> dcast(df, user ~ click)

  user 1 2 3
1    A 1 2 3
2    B 1 2 3
3    C 1 2 3

edited Jul 29 '15 at 12:19

answered Jul 29 '15 at 11:41


Thanks ulfelder. The issue in this case is that I can't fix the number of clicks at 3, because the number of clicks varies by user. – Vaibhav Srivastava Jul 29 '15 at 12:42
The number of clicks doesn't have to be constant across users for this to work. If the numbers are uneven, `dcast()` will put NAs in the extras. So if user A has n clicks and user B has n - 2, you'll get NAs in the last two columns for user B. In other words, it will do the same thing that `splitstackshape` does under those conditions. – ulfelder Jul 29 '15 at 12:46
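
To illustrate the point about uneven click counts, here is a minimal sketch using the same `melt()`/`dcast()` calls as the answer. The toy data (three clicks for A, one for B, two for C) is made up for illustration and is not the asker's real file.

```r
library(reshape2)

# Hypothetical toy data with uneven click counts:
# A has three clicks, B has one, C has two.
clicks <- data.frame(
  user  = c("A", "A", "A", "B", "C", "C"),
  click = c(1, 2, 3, 1, 1, 2)
)

# Same two steps as in the answer: melt to long form, then cast to wide.
long <- melt(clicks, id.vars = "user")
wide <- dcast(long, user ~ variable + value, value.var = "value")

# Cells with no matching click are filled with NA.
print(wide)
```

Note that this produces one column per distinct value of `click`, so it fits best when `click` is a per-user sequence number rather than an arbitrary page identifier.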

score 1 · Answer 2 · answered Jul 29 '15 at 11:53

This is one option:

library(data.table)       # setDT() comes from data.table
library(splitstackshape)  # cSplit() comes from splitstackshape
cSplit(setDT(df)[, toString(V4), by='V3'], 'V1', ',')

#      V3    V1_1    V1_2    V1_3
#1: User1 -Click1 -Click2 -Click3
#2: User2 -Click1      NA      NA
#3: User3 -Click1 -Click2      NA

data

df = structure(list(V1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Row", class = "factor"), 
    V2 = c(1, 2, 3, 4, 5, 6), V3 = structure(c(1L, 1L, 1L, 2L, 
    3L, 3L), .Label = c("User1", "User2", "User3"), class = "factor"), 
    V4 = structure(c(1L, 2L, 3L, 1L, 1L, 2L), .Label = c("-Click1", 
    "-Click2", "-Click3"), class = "factor")), .Names = c("V1", 
"V2", "V3", "V4"), class = "data.frame", row.names = c(NA, -6L
))

Thanks Veerendra. No, the data is not a data frame. The number of clicks varies from 1 to something like 10,000 for any user. – Vaibhav Srivastava, Jul 29 '15 at 12:33
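
Since the data starts as a CSV file and a user may have thousands of clicks, one alternative to very wide columns is to keep one entry per user. The sketch below uses base R only. The column names `user` and `click` and the inline CSV text are assumptions, because the real file layout was not shown; for an actual file, `read.csv()` would take the file path instead of `text =`.

```r
# Hypothetical CSV content; the real file's layout and column names are unknown.
csv_text <- "user,click
User1,Click1
User1,Click2
User1,Click3
User2,Click1
User3,Click1
User3,Click2"

# read.csv() returns a data frame, which is what the answers here expect.
clicks <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Option 1: a named list with one character vector per user.
# The vectors can have different lengths, so no NA padding is needed.
by_user <- split(clicks$click, clicks$user)

# Option 2: one row per user, with the clicks joined into a single string.
collapsed <- aggregate(click ~ user, data = clicks,
                       FUN = paste, collapse = " - ")
print(collapsed)
```

The list form is usually easier to work with for further processing, while the collapsed string is closer to the layout shown in the question.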

score 0 · Answer 3 · answered Jul 29 '15 at 13:33

Given a data frame like this, you can use base R's `reshape()` function:

   user   click
1 User1 -Click1
2 User1 -Click2
3 User1 -Click3
4 User2 -Click1
5 User3 -Click1
6 User3 -Click2

df$n <- df$click  # copy click so there is a value column to spread
reshape(df, idvar = "user", timevar = "click", direction = "wide")

Output:

   user n.-Click1 n.-Click2 n.-Click3
1 User1   -Click1   -Click2   -Click3
4 User2   -Click1      <NA>      <NA>
5 User3   -Click1   -Click2      <NA>
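
If the click values are arbitrary (for example page names) rather than a fixed set, a variation is to number each user's clicks in order and spread on that number. This is a sketch under that assumption, using base R only; the helper column `seq` is introduced here for illustration and is not part of the original answer.

```r
# Same toy data as above, rebuilt so the example is self-contained.
clicks <- data.frame(
  user  = c("User1", "User1", "User1", "User2", "User3", "User3"),
  click = c("-Click1", "-Click2", "-Click3", "-Click1", "-Click1", "-Click2"),
  stringsAsFactors = FALSE
)

# Number each user's clicks 1, 2, 3, ... in order of appearance.
clicks$seq <- ave(seq_along(clicks$user), clicks$user, FUN = seq_along)

# Spread on the sequence number: one column per click position.
wide <- reshape(clicks, idvar = "user", timevar = "seq", direction = "wide")
print(wide)
```

Users with fewer clicks than the longest sequence get `NA` in the trailing columns, and the number of columns equals the largest click count of any single user.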
