
I have a simple question that I could not find an answer to. For an example I am building, I want to add a column of random years within a certain range, say 2004-2010, to the following data.table.

library(data.table)
set.seed(1)
DT <- data.table(panelID = sample(50, 50),   # Creates a panel ID
                 Country = c(rep("Albania", 30), rep("Belarus", 50), rep("Chilipepper", 20)),
                 some_NA = sample(0:5, 6),
                 some_NA_factor = sample(0:5, 6),
                 Group = c(rep(1, 20), rep(2, 20), rep(3, 20), rep(4, 20), rep(5, 20)),
                 norm = round(runif(100) / 10, 2),
                 Income = round(rnorm(10, -5, 5), 2),
                 Happiness = sample(10, 10),
                 Sex = round(rnorm(10, 0.75, 0.3), 2),
                 Age = sample(100, 100),
                 Educ = round(rnorm(10, 0.75, 0.3), 2))
DT[, uniqueID := .I]   # Creates a unique ID
DT[DT == 0] <- NA      # https://stackoverflow.com/questions/11036989/replace-all-0-values-to-na
DT$some_NA_factor <- factor(DT$some_NA_factor)
Tom

1 Answer


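We can use sample to draw one random year per row from 2004:2010. replace = TRUE is required here because the table has more rows (.N) than there are distinct years in the range, so the same year has to be drawn more than once.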

library(data.table)
DT[, random_year := sample(2004:2010, .N, replace = TRUE)]
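
As a quick sanity check, the same idea can be tried on a small stand-in table. This is only a sketch: the table toy and its id column are hypothetical and not part of the question's data.

```r
library(data.table)

set.seed(1)

# Small stand-in table (hypothetical; not the DT from the question)
toy <- data.table(id = 1:10)

# One draw per row; replace = TRUE lets the same year appear more than once
toy[, random_year := sample(2004:2010, .N, replace = TRUE)]

# Confirm that every drawn value lies inside the requested range
stopifnot(all(toy$random_year %in% 2004:2010))
print(toy)
```

Dropping replace = TRUE would make sample fail for this table, since 10 values cannot be drawn without replacement from only 7 years.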
Ronak Shah