Add a column with values ranging from 1 to the number of occurrences of each value of a variable

Question

I have a dataframe as displayed below:

Transaction#    ID
101             ABC
101             EFG
101             IJK
102             LMO
102             PQR
103             STU

I want to add one more column with the values shown below. For each row, the new column should hold a running count that starts at 1 and goes up to the number of times that Transaction# value occurs.

Transaction#    ID     Number
101             ABC    1
101             EFG    2
101             IJK    3
102             LMO    1
102             PQR    2
103             STU    1

I found a reference in another similar question. This can be done with the data.table package, which suits me well because my dataset is huge: `dt[, Num := seq_len(.N), by = Transaction]` – shekharamit Mar 29 '18 at 11:57
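For readers who want to try the data.table one-liner from the comment above, here is a minimal sketch. It assumes the data.table package is installed and that the column is named `Transaction` (the `#` in `Transaction#` is not a syntactic R name); the object name `dt` and the extra column `Num2` are only illustrative.

```r
library(data.table)

# Sample data mirroring the question (column renamed to Transaction).
dt <- data.table(
  Transaction = c(101L, 101L, 101L, 102L, 102L, 103L),
  ID = c("ABC", "EFG", "IJK", "LMO", "PQR", "STU")
)

# .N is the number of rows in the current group, so seq_len(.N) counts
# 1..N within each Transaction; := adds the column by reference.
dt[, Num := seq_len(.N), by = Transaction]

# rowid() is a data.table helper that builds the same within-group counter.
dt[, Num2 := rowid(Transaction)]

print(dt)
```

Because `:=` modifies the table in place, no copy of the data is made, which is probably what makes this form attractive for a large dataset.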

score 2 · Answer 1 · answered Mar 29 '18 at 11:01

Another way to do it is with `ave()` inside `transform()`:

df <- transform(df, Number=ave(Transaction, Transaction, FUN=seq_along))

Result

 Transaction  ID     Number
         101 ABC        1
         101 EFG        2
         101 IJK        3
         102 LMO        1
         102 PQR        2
         103 STU        1

Data

df <- structure(list(Transaction = c(101L, 101L, 101L, 102L, 102L,
103L), ID = structure(1:6, .Label = c("ABC", "EFG", "IJK", "LMO",
"PQR", "STU"), class = "factor")), .Names = c("Transaction",
"ID"), class = "data.frame", row.names = c(NA, -6L))
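To make it clearer what `ave(Transaction, Transaction, FUN=seq_along)` is doing, the same result can be built by hand in three base R steps. This is only an illustrative sketch of the idea, not how `ave()` must be used; the intermediate names `groups` and `counters` are made up for the example.

```r
df <- data.frame(
  Transaction = c(101L, 101L, 101L, 102L, 102L, 103L),
  ID = c("ABC", "EFG", "IJK", "LMO", "PQR", "STU")
)

# Step 1: split the column into one vector per Transaction value.
groups <- split(df$Transaction, df$Transaction)

# Step 2: replace each group's values with 1, 2, ..., length(group).
counters <- lapply(groups, seq_along)

# Step 3: put the pieces back in the original row order.
df$Number <- unsplit(counters, df$Transaction)

print(df)
```

`ave()` wraps this split, apply, and reassemble pattern in a single call.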

edited Mar 29 '18 at 11:06

answered Mar 29 '18 at 11:01

Rana Usman


You don't need the first `seq_along` – moodymudskipper Mar 29 '18 at 11:04

I agree @Moody_Mudskipper, I edited. – Rana Usman Mar 29 '18 at 11:07
Thanks, it worked well. – shekharamit Mar 29 '18 at 11:17

score 1 · Answer 2 · answered Mar 29 '18 at 10:58


With base R you can do the following.

dat$Number <- ave(dat$Transaction, dat$Transaction, FUN = function(x) seq.int(1, length(x)))
#  Transaction  ID Number
#1         101 ABC      1
#2         101 EFG      2
#3         101 IJK      3
#4         102 LMO      1
#5         102 PQR      2
#6         103 STU      1

DATA.

dat <- read.table(text = "
Transaction    ID
101             ABC
101             EFG
101             IJK
102             LMO
102             PQR
103             STU
", header = TRUE)

answered Mar 29 '18 at 10:58

Rui Barradas


Or just `ave(dat$Transaction, dat$Transaction, FUN=seq_along)`, (maybe slightly less efficient but not even sure) – moodymudskipper Mar 29 '18 at 11:03
@Moody_Mudskipper You are right, I completely forgot that one. – Rui Barradas Mar 29 '18 at 11:25
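One detail worth knowing about the `ave()` forms discussed here: `ave()` writes the results back into a copy of its first argument, so the counter takes on that argument's type. With an integer key this does not matter, but with a character key the counter comes back as text. The sketch below uses hypothetical character keys (`dat2` and its values are invented for illustration, and deliberately unsorted) and passes row positions as the first argument to keep the counter an integer.

```r
# Hypothetical data: character keys, deliberately not sorted by group.
dat2 <- data.frame(
  Transaction = c("T102", "T101", "T102", "T101", "T103", "T101"),
  stringsAsFactors = FALSE
)

# Counting on the key column itself coerces the counter to the key's type.
as_key_type <- ave(dat2$Transaction, dat2$Transaction, FUN = seq_along)

# Counting on row positions keeps the counter an integer.
dat2$Number <- ave(seq_along(dat2$Transaction), dat2$Transaction,
                   FUN = seq_along)

str(as_key_type)
print(dat2)
```

The example also shows that the rows do not need to be sorted by `Transaction`; each row is numbered by its order of appearance within its own group.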

score 0 · Answer 3 · answered Mar 29 '18 at 10:51

With dplyr you can group by Transaction and then use mutate on each group to create the desired vector.

library(dplyr)
df <- data.frame(Transaction = c(101, 101, 101, 102, 102, 103),
                 ID = c("ABC", "EFG", "IJK", "LMO", "PQR", "STU"))

df %>% 
  group_by(Transaction) %>%
  mutate(N = 1:n())

# A tibble: 6 x 3
# Groups:   Transaction [3]
  Transaction ID        N
        <dbl> <fct> <int>
1        101. ABC       1
2        101. EFG       2
3        101. IJK       3
4        102. LMO       1
5        102. PQR       2
6        103. STU       1

You can also use `row_number()` instead of `1:n()` – Roman Mar 29 '18 at 11:03
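A minimal sketch of the `row_number()` variant suggested in the comment above, assuming dplyr is installed. The `ungroup()` call and the name `result` are additions for the example, not part of the original answer.

```r
library(dplyr)

df <- data.frame(
  Transaction = c(101, 101, 101, 102, 102, 103),
  ID = c("ABC", "EFG", "IJK", "LMO", "PQR", "STU")
)

result <- df %>%
  group_by(Transaction) %>%
  mutate(N = row_number()) %>%  # 1, 2, ... within each Transaction
  ungroup()                     # drop the grouping for later steps

print(result)
```

On this data `row_number()` and `1:n()` give the same column; dropping the grouping afterwards is a matter of taste, but it avoids surprises if further `mutate()` or `summarise()` calls follow.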
