
I would like to create time-use variables in R from the UKTUS database.

UKTUS records each person's activity in 10-minute slots, stored in the dataset as the variables act1_1, act1_2, ..., act1_144 (144 x 10 minutes).

Each activity is broken down into several codes. For example, SLEEP is coded as follows:

110 Sleep
111 In bed not asleep
120 Sick in bed

I created a matrix in R with 129 columns and 16533 rows.

Activities <- uktus15_diary_wide[, c("serial", "pnum", "ddayw", "DVAge", "dmonth", "dyear", "WhenDiary", "AfterDiaryDay", "WhereStart", "WhereEnd", "RushedD", "Ordinary", "KindOfDay", "Trip", "enjm1",
                                    "act1_1", "act1_2", "act1_3", "act1_4", "act1_5", "act1_6", "act1_7", "act1_8", "act1_9", "act1_10",
                                    "act1_11", "act1_12", "act1_13", "act1_14", "act1_15", "act1_16", "act1_17", "act1_18", "act1_19", "act1_20",
                                    "act1_21", "act1_22", "act1_23", "act1_24", "act1_25", "act1_26", "act1_27", "act1_28", "act1_29", "act1_30",
                                    "act1_31", "act1_32", "act1_33", "act1_34", "act1_35", "act1_36", "act1_37", "act1_38", "act1_39", "act1_40",
                                    "act1_41", "act1_42", "act1_43", "act1_44", "act1_45", "act1_46", "act1_47", "act1_48", "act1_49", "act1_50",
                                    "act1_51", "act1_52", "act1_53", "act1_54", "act1_55", "act1_56", "act1_57", "act1_58", "act1_59", "act1_60",
                                    "act1_61", "act1_62", "act1_63", "act1_64", "act1_65", "act1_66", "act1_67", "act1_68", "act1_69", "act1_70",
                                    "act1_71", "act1_72", "act1_73", "act1_74", "act1_75", "act1_76", "act1_77", "act1_78", "act1_79", "act1_80",
                                    "act1_81", "act1_82", "act1_83", "act1_84", "act1_85", "act1_86", "act1_87", "act1_88", "act1_89", "act1_90",
                                    "act1_91", "act1_92", "act1_93", "act1_94", "act1_95", "act1_96", "act1_97", "act1_98", "act1_99", "act1_100",
                                    "act1_101", "act1_102", "act1_103", "act1_104", "act1_105", "act1_106", "act1_107", "act1_108", "act1_109",
                                    "act1_110", "act1_111", "act1_112", "act1_113", "act1_114")]

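The same selection can be written without typing every name. This is a minimal sketch, assuming the activity columns follow the act1_<n> naming shown above. The small `uktus15_diary_wide` built here is a made-up stand-in so the snippet runs on its own; it is not real UKTUS data.

```r
# Hypothetical stand-in for uktus15_diary_wide: 2 rows holding the 15 leading
# variables plus act1_1 ... act1_144, all filled with placeholder zeros.
lead_cols <- c("serial", "pnum", "ddayw", "DVAge", "dmonth", "dyear",
               "WhenDiary", "AfterDiaryDay", "WhereStart", "WhereEnd",
               "RushedD", "Ordinary", "KindOfDay", "Trip", "enjm1")
all_cols <- c(lead_cols, paste0("act1_", 1:144))
uktus15_diary_wide <- as.data.frame(
  matrix(0, nrow = 2, ncol = length(all_cols), dimnames = list(NULL, all_cols))
)

# Build the activity names programmatically instead of listing them.
act_cols <- paste0("act1_", 1:114)
Activities <- uktus15_diary_wide[, c(lead_cols, act_cols)]

dim(Activities)  # 2 rows, 15 + 114 = 129 columns
```

Changing `1:114` to `1:144` would keep all 144 time steps.
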
For the TV variable I wrote the code in Stata, but I don't know how to rewrite it in R. Could somebody help me, please?

I have 144 time steps, and activity codes strictly between 8209 and 8230 (that is, 8210 to 8229) relate to TV watching.

The Stata commands for the radio/TV variables are the following:

generate tv = 0
generate radio = 0
forvalues i = 1/144 {
    replace tv = tv+10 if (act1_`i' > 8209 & act1_`i' < 8230)
    replace radio = radio+10 if (act1_`i' > 8229 & act1_`i' < 8321)
}
Nick Cox
  • Can you give a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of what you are trying to achieve? – markus Aug 15 '18 at 06:51
  • I've added the `[stata]` tag, perhaps it will bring somebody with both skills to the question. – r2evans Aug 15 '18 at 06:52
  • @markus thank you for your help. I would like to have a variable called TV that is defined by the following codes under TV AND VIDEO: 8210 Unspecified TV video or DVD watching; 8211 Watching a film on TV; 8212 Watching sport on TV; 8219 Other specified TV watching; 8220 Unspecified video watching; 8221 Watching a film on video; 8222 Watching sport on video; 8229 Other specified video watching. – RforDummies Aug 15 '18 at 06:57
  • So in Stata you have 144 variables `act1_1`, `act1_2` to `act1_144` and you want to count how many times they fall in specified ranges. Your first steps include explaining clearly how you are holding the data in R. Translate to something much simpler, as no one who knows the answer (not me; I am a Stata person) wants or needs an example with 144 columns. Edit your question; don't add crucial detail in comments. – Nick Cox Aug 15 '18 at 07:02
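
A reduced example along the lines requested in the comments might look like the sketch below. The data frame `toy`, its four slots, and the codes in it are made up for illustration; only the column naming and the 8210 to 8229 TV range come from the question.

```r
# Toy diary: 3 people, 4 ten-minute slots (made-up codes, not UKTUS data).
toy <- data.frame(
  serial = c(1, 1, 2),
  pnum   = c(1, 2, 1),
  act1_1 = c(110, 8210, 8310),
  act1_2 = c(8211, 8210, 8310),
  act1_3 = c(8229, 110, NA),
  act1_4 = c(110, 8230, 8212)
)

# Wanted: minutes of TV per row, counting 10 minutes for each slot
# whose code lies in 8210-8229.
# Row 1: slots 2 and 3 -> 20; row 2: slots 1 and 2 -> 20; row 3: slot 4 -> 10.
expected_tv <- c(20, 20, 10)
```

Stating the expected result alongside the input makes it easy to check any proposed translation of the Stata loop.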

1 Answer


Here is the R syntax for the loop and if conditions you specified. The column index is i + 15 because the activity columns in your data frame start after 15 other columns.

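# Running totals in minutes; each becomes one value per row after the first pass
tv <- 0
radio <- 0
for (i in 1:144) {
  # Activity slot i sits in column i + 15 (15 other columns come first)
  tv <- ifelse(Activities[, i + 15] > 8209 & Activities[, i + 15] < 8230, tv + 10, tv)
  radio <- ifelse(Activities[, i + 15] > 8229 & Activities[, i + 15] < 8321, radio + 10, radio)
}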
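
A loop-free variant, offered as a sketch rather than a definitive translation: compare the whole block of activity columns at once and count matching slots per row with `rowSums`. It selects columns by name, so it does not depend on the 15-column offset. `na.rm = TRUE` treats a missing slot as a non-match, which is an assumption about how missing codes should be handled. It mirrors Stata, where a missing value counts as larger than any number and so never satisfies the upper bound, whereas `ifelse` in a loop would turn the running total into `NA`. The `Activities` data frame below is a made-up stand-in.

```r
# Made-up stand-in for Activities: 2 id columns + 3 activity slots.
Activities <- data.frame(
  serial = c(1, 2),
  pnum   = c(1, 1),
  act1_1 = c(8210, 8300),
  act1_2 = c(8229, NA),
  act1_3 = c(110, 8230)
)

# Pick the activity columns by name, then test all slots at once.
act <- as.matrix(Activities[, grep("^act1_", names(Activities))])

# Each matching slot is worth 10 minutes.
tv    <- 10 * rowSums(act > 8209 & act < 8230, na.rm = TRUE)
radio <- 10 * rowSums(act > 8229 & act < 8321, na.rm = TRUE)
```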
Nar
  • Dear Nar, thank you for the code. Unfortunately, when I run it I get a column named act1_144 with values ranging from 0 to 1100 minutes. However, when I run a statistical test in Stata, the mean and standard deviation of the TV variable do not match. In Stata the standard deviation for TV is 233.8124 and the mean is 295.4556. In R the standard deviation for TV is 128.5139 and the mean is 138.5441. Could you help me with this, please? Is this due to the sum command? Thank you – RforDummies Aug 15 '18 at 08:47
  • Is this because of the Activity matrix definition, or a missing for loop, so that it reads only the 144th column and not the whole Activity matrix? Thank you – RforDummies Aug 15 '18 at 09:12
  • Can you please give the code you use to calculate the mean and standard deviation? Thank you – Nar Aug 15 '18 at 09:33
  • Dear Nar, basically I created a do-file in Stata with the following code: generate tv = 0 generate radio = 0 forvalues i = 1/144 { replace tv = tv+10 if (act1_`i' > 8209 & act1_`i' < 8230) replace radio = radio+10 if (act1_`i' > 8229 & act1_`i' < 8321) }. After this, I run Stata's summarize command on tv to get the mean and standard deviation. – RforDummies Aug 15 '18 at 09:51
  • Dear Nar, thank you for your help; the code is good. If you have time, could you help me with two things: (1) how to name the variable TV or RADIO instead of act1_144, and (2) how to group TV and radio by their identifiers, the serial and pnum variables. Thank you – RforDummies Aug 15 '18 at 11:19
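
For the two follow-up points, one possible sketch: store the totals under the names you want as columns of the data frame, then use base R's `aggregate` to combine them per `serial` and `pnum`. Summing over a person's diary days (rather than averaging) is an assumption about the summary wanted, and the data below are made up.

```r
# Made-up stand-in: person (1, 1) has two diary days, person (1, 2) has one.
Activities <- data.frame(
  serial = c(1, 1, 1),
  pnum   = c(1, 1, 2),
  act1_1 = c(8210, 8211, 8300),
  act1_2 = c(8212, 110, 8310)
)
act <- as.matrix(Activities[, grep("^act1_", names(Activities))])

# (1) Store the totals as named columns of the data frame.
Activities$tv    <- 10 * rowSums(act > 8209 & act < 8230, na.rm = TRUE)
Activities$radio <- 10 * rowSums(act > 8229 & act < 8321, na.rm = TRUE)

# (2) Total minutes per person, identified by serial and pnum.
per_person <- aggregate(cbind(tv, radio) ~ serial + pnum,
                        data = Activities, FUN = sum)

# Mean and standard deviation of the per-diary totals, for comparison
# with the figures reported by Stata's summarize.
mean(Activities$tv)
sd(Activities$tv)
```

Replacing `FUN = sum` with `FUN = mean` would give average minutes per diary day for each person instead.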