I need to create a variable in my df that assigns a unique sequential value to each group produced by a split. While searching I found that split() can help, but I am stuck on how to assign the sequential value.

A simplified form of my data is:

df <- structure(list(Year = c(2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 
2014L, 2014L), Session = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = "July", class = "factor"), SiteName = structure(c(2L, 
2L, 1L, 1L, 4L, 4L, 3L, 3L), .Label = c("Kaoshe", "Matoa", 
"Livingi", "Sedina"), class = "factor"), Temp = c(23L, 12L, 15L, 
27L, 30L, 21L, 21L, 21L)), .Names = c("Year", "Session", "SiteName", 
"Temp"), class = "data.frame", row.names = c(NA, -8L))

I tried:

temp <- split(df, df[, c("SiteName", "Session", "Year")])

I want the group identity placed in another variable (df$order), so that every row in the first split gets the value 1, every row in the second gets 2, every row in the third gets 3, and so on. I am relatively new to R and can't work out the looping.

My desired output looks like this:

Year    Session SiteName    Temp    order
2014    July    Matoa   23  1
2014    July    Matoa   12  1
2014    July    Kaoshe  15  2
2014    July    Kaoshe  27  2
2014    July    Sedina  30  3
2014    July    Sedina  21  3
2014    July    Livingi 21  4
2014    July    Livingi 21  4
Taw

1 Answer

We can use .GRP from data.table, which holds a simple counter for the current group:

library(data.table)
setDT(df)[, order := .GRP, .(SiteName, Session, Year)]
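For reference, a self-contained sketch of the same idea. It rebuilds the question's data with data.frame() and uses as.data.table() (a copy) instead of setDT(), so df itself stays a plain data.frame; the name dt is only illustrative. It assumes the data.table package is installed.

```r
library(data.table)

# Rebuild the question's example data
df <- data.frame(
  Year     = rep(2014L, 8),
  Session  = factor(rep("July", 8)),
  SiteName = factor(c("Matoa", "Matoa", "Kaoshe", "Kaoshe",
                      "Sedina", "Sedina", "Livingi", "Livingi")),
  Temp     = c(23L, 12L, 15L, 27L, 30L, 21L, 21L, 21L)
)

# as.data.table() copies, so df is left untouched (dt is an illustrative name)
dt <- as.data.table(df)

# With `by`, groups are numbered in order of first appearance
dt[, order := .GRP, by = .(SiteName, Session, Year)]

dt$order
#[1] 1 1 2 2 3 3 4 4
```

Working on a copy matters if the base R line is tried afterwards: once df has been converted with setDT(), df[1:3] selects rows rather than columns.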

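Or with base R (the first three columns are the grouping variables; this relies on each group's rows being adjacent, as they are here):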

df$order <- cumsum(!duplicated(df[1:3]))
df$order
#[1] 1 1 2 2 3 3 4 4
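The cumsum(!duplicated(...)) approach counts a new group each time an unseen combination appears, so it only gives a per-group id when the rows of a group sit next to each other. If that cannot be guaranteed, one possible base R variant is to match each row's key against the unique keys. The sketch below uses a hypothetical, shuffled subset of the question's rows to show the difference; the names keys and naive are illustrative only.

```r
# Hypothetical shuffled subset of the question's rows (groups are not adjacent)
df <- data.frame(
  Year     = rep(2014L, 6),
  Session  = rep("July", 6),
  SiteName = c("Matoa", "Kaoshe", "Matoa", "Sedina", "Kaoshe", "Sedina"),
  Temp     = c(23L, 15L, 12L, 30L, 27L, 21L)
)

# One key per row built from the grouping columns
keys <- interaction(df[c("SiteName", "Session", "Year")], drop = TRUE)

# Position of each key among the unique keys, in order of first appearance
df$order <- match(keys, unique(keys))
df$order
#[1] 1 2 1 3 2 3

# For comparison: the cumsum approach on the same shuffled rows
naive <- cumsum(!duplicated(df[1:3]))
naive
#[1] 1 2 2 3 3 3
```

On the question's original data, where each site's rows are adjacent, both approaches give 1 1 2 2 3 3 4 4.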
akrun