I have data from an experiment with multiple rows per item (each row holds the reading time for one word of an n-word sentence) and multiple items per subject. Items can span varying numbers of rows. Items were presented in a random order, and their order in the data as initially read in reflects the sequence in which each subject saw them. What I'd like to do is add a column that contains the order in which the subject saw that item (i.e., 1 for the first item, 2 for the second, etc.).

Here's an example of some input data that has the relevant properties:

d <- data.frame(Subject = c(1,1,1,1,1,2,2,2,2,2), 
                Item =    c(2,2,2,1,1,1,1,2,2,2))

 Subject Item
       1    2
       1    2
       1    2
       1    1
       1    1
       2    1
       2    1
       2    2
       2    2 
       2    2

And here's the output I want:

 Subject Item order
       1    2     1
       1    2     1
       1    2     1
       1    1     2
       1    1     2
       2    1     1
       2    1     1
       2    2     2
       2    2     2
       2    2     2

I know I can do this by setting up a temporary data frame that filters d to the unique combinations of Subject and Item, adding order to that as something like 1:n() or row_number() within each Subject, and then using a join to merge it back into the main data frame. What I'd like to know is whether there's a way to do this without creating a new data frame just to store the order. Can this be done inside dplyr's mutate somehow, for instance if I group by Subject and Item?
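For concreteness, the workaround described above might look roughly like the following sketch. The names item_order and result are placeholders chosen for illustration, and it assumes dplyr is installed; it is the approach I'd like to avoid, not a recommendation.

```r
library(dplyr)

d <- data.frame(Subject = c(1,1,1,1,1,2,2,2,2,2),
                Item    = c(2,2,2,1,1,1,1,2,2,2))

# Temporary lookup table (hypothetical name): one row per Subject/Item pair,
# kept in order of first appearance, then numbered within each Subject.
item_order <- d %>%
  distinct(Subject, Item) %>%
  group_by(Subject) %>%
  mutate(order = row_number()) %>%
  ungroup()

# Join the lookup table back onto the full data.
result <- left_join(d, item_order, by = c("Subject", "Item"))
result
```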

Jigsaw

2 Answers


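Here's one way: group by Subject, then match each Item against unique(Item), which numbers the items by the order of their first appearance within that subject: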

library(dplyr)

d %>%
  group_by(Subject) %>%
  mutate(order = match(Item, unique(Item))) %>%
  ungroup()
# # A tibble: 10 x 3
#    Subject  Item order
#      <dbl> <dbl> <int>
#  1       1     2     1
#  2       1     2     1
#  3       1     2     1
#  4       1     1     2
#  5       1     1     2
#  6       2     1     1
#  7       2     1     1
#  8       2     2     2
#  9       2     2     2
# 10       2     2     2
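To see why this works, it may help to run the two functions on a single subject's Item vector. This is only a small illustration; the vector x is made up to mirror Subject 1 in the example data.

```r
# Items for one subject, in presentation order (illustrative values)
x <- c(2, 2, 2, 1, 1)

# unique() keeps the distinct values in order of first appearance
u <- unique(x)
u

# match() returns, for each element of x, its position in u
ord <- match(x, u)
ord
```

Because the numbering is based on first appearance, an item that showed up again later for the same subject would get the same number as its first occurrence.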
r2evans

Here is a base R option:

transform(d,
  order = ave(Item, Subject, FUN = function(x) as.integer(factor(x, levels = unique(x))))
)

or

transform(d,
  order = ave(Item, Subject, FUN = function(x) match(x, unique(x)))
)

both giving

   Subject Item order
1        1    2     1
2        1    2     1
3        1    2     1
4        1    1     2
5        1    1     2
6        2    1     1
7        2    1     1
8        2    2     2
9        2    2     2
10       2    2     2
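One detail to be aware of: ave() writes the results back into a copy of its first argument, so with Item stored as double the new column should come back as double too, even though match() itself returns integers. If an integer column is wanted, a sketch like the following (the name res is just a placeholder) converts it afterwards:

```r
d <- data.frame(Subject = c(1,1,1,1,1,2,2,2,2,2),
                Item    = c(2,2,2,1,1,1,1,2,2,2))

res <- transform(d,
  order = ave(Item, Subject, FUN = function(x) match(x, unique(x)))
)

# Convert explicitly if an integer column is preferred
res$order <- as.integer(res$order)
str(res)
```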
ThomasIsCoding