I'm currently working on translating some commands for time-series data in Stata into R. I'm using the zoo
package to calculate moving averages in R. Here is what my data looks like:
data <- cbind(c(1960:1970), c(95.5, 95.3, 95.3, 95.7, 95.7, 95.7,
95.1, 95.1, 95.1, 95, 95))
[,1] [,2]
[1,] 1960 95.5
[2,] 1961 95.3
[3,] 1962 95.3
[4,] 1963 95.7
[5,] 1964 95.7
[6,] 1965 95.7
[7,] 1966 95.1
[8,] 1967 95.1
[9,] 1968 95.1
[10,] 1969 95.0
[11,] 1970 95.0
I'll make this into a data.frame:
data <- as.data.frame(data)
Now, I can use the rollmean function to calculate the moving averages for turnout with my data:
library(zoo)  # provides rollmean()
data$turnout <- rollmean(data[,2], 1, fill = NA)
And this is what I get:
V1 V2 turnout
1 1960 95.5 95.5
2 1961 95.3 95.3
3 1962 95.3 95.3
4 1963 95.7 95.7
5 1964 95.7 95.7
6 1965 95.7 95.7
7 1966 95.1 95.1
8 1967 95.1 95.1
9 1968 95.1 95.1
10 1969 95.0 95.0
11 1970 95.0 95.0
This is all well and good, but my issue is that I want my column turnout (moving average) to start at 1961 instead of 1960. This code does not exclude the first observation, which is what I am trying to do.
For reference, the equivalent Stata command would be:
tssmooth ma m1turnout = turnout, window (1 0)
I have already tried using the align = "right" argument, but that doesn't seem to do the trick. Any ideas?
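As a point of comparison, window(1 0) in tssmooth ma asks for one lagged term, no current observation, and no leads, so each row should simply hold the previous year's value. A minimal sketch of that idea in R, assuming this reading of the Stata option is the intended one (ma1 is only an illustrative name; m1turnout mirrors the Stata variable): compute the right-aligned mean with zoo's rollmeanr(), then move the result down one row.

```r
library(zoo)

# Same example series as in the question
turnout <- c(95.5, 95.3, 95.3, 95.7, 95.7, 95.7, 95.1, 95.1, 95.1, 95, 95)

# rollmeanr() is rollmean() with align = "right": the window ends at the current row
ma1 <- rollmeanr(turnout, k = 1, fill = NA)

# Drop the last value and pad the front with NA so every average moves down one row
m1turnout <- c(NA, head(ma1, -1))

print(data.frame(year = 1960:1970, turnout, m1turnout))
```

The shift is applied after the rolling mean, not before it, because the default rollmean() method does not handle NA values in its input.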
Thanks in advance!
Edit: to clarify, I'm doing this across different window lengths. In Stata the full code is as follows, where since is a variable that describes the number of years since an intervention.
foreach y of numlist 1(1)10{
tssmooth ma m`y'turnout = turnout, window (`y' 0)
}
gen dvturnout=.
foreach y of numlist 2(1)9{
replace dvturnout = l1.turnout if since==1
replace dvturnout = m`y'turnout if since==`y' & m`y'turnout!=.
replace dvturnout = m10turnout if (since==10 & m10turnout!=.) | (since==. & redist!=. & m10turnout!=.)
}
foreach y of numlist 1(1)10{
drop m`y'turnout
}
My ultimate goal is this dvturnout variable.
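To make the target concrete, here is a rough sketch of the same selection done directly in R, without the ten temporary columns. It assumes that dvturnout in each row is the mean of up to `since` preceding years (capped at 10, with shorter windows averaged over whatever years exist), and it leaves out the redist condition entirely. The `since` values below are hypothetical, made up only for illustration.

```r
# Same example series as in the question
turnout <- c(95.5, 95.3, 95.3, 95.7, 95.7, 95.7, 95.1, 95.1, 95.1, 95, 95)

# HYPOTHETICAL values for illustration only: years elapsed since an intervention
since <- c(NA, NA, NA, NA, NA, 1, 2, 3, 4, 5, 6)

dvturnout <- vapply(seq_along(turnout), function(i) {
  y <- since[i]
  # No window is defined when since is missing or there is no earlier row
  if (is.na(y) || i == 1) return(NA_real_)
  y <- min(y, 10)                  # windows longer than 10 years are capped at 10
  first <- max(1, i - y)           # do not reach back past the start of the series
  mean(turnout[first:(i - 1)])     # average the preceding rows, excluding row i
}, numeric(1))

print(data.frame(year = 1960:1970, since, turnout, dvturnout))
```

With these made-up `since` values, the 1967 row (since = 3) averages 1964 to 1966 and gives 95.5.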
When I try what I presume corresponds to the first section of the code in Stata, that is:
foreach y of numlist 1(1)10{
tssmooth ma m`y'turnout = turnout, window (`y' 0)
}
In R, I do this (where [,35] is the column I'm starting to add variables to):
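# Outer loop: one data frame per country in the list data_countries
for (i in seq_along(data_countries)) {
  # Inner loop: right-aligned moving averages of column 13 for windows of 1 to 10 years
  for (j in 1:10) {
    data_countries[[i]][, 35 + j] <- rollmean(data_countries[[i]][, 13], j, fill = NA, align = "right")
  }
}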
And it spits out this for me:
   year since  V36   V37      V38    V39   V40      V41      V42     V43      V44   V45
 1 1960    NA 95.5    NA       NA     NA    NA       NA       NA      NA       NA    NA
 2 1961    NA 95.3 95.40       NA     NA    NA       NA       NA      NA       NA    NA
 3 1962    NA 95.3 95.30 95.36667     NA    NA       NA       NA      NA       NA    NA
 4 1963    NA 95.7 95.50 95.43333 95.450    NA       NA       NA      NA       NA    NA
 5 1964    NA 95.7 95.70 95.56667 95.500 95.50       NA       NA      NA       NA    NA
 6 1965    NA 95.7 95.70 95.70000 95.600 95.54 95.53333       NA      NA       NA    NA
 7 1966    NA 95.1 95.40 95.50000 95.550 95.50 95.46667 95.47143      NA       NA    NA
 8 1967    NA 95.1 95.10 95.30000 95.400 95.46 95.43333 95.41428 95.4250       NA    NA
 9 1968    NA 95.1 95.10 95.10000 95.250 95.34 95.40000 95.38571 95.3750 95.38889    NA
10 1969    NA 95.0 95.05 95.06667 95.075 95.20 95.28333 95.34286 95.3375 95.33333 95.35
11 1970    NA 95.0 95.00 95.03333 95.050 95.06 95.16667 95.24286 95.3000 95.30000 95.30
These numbers are all fine, but they are offset by one row from where I want them: each average should appear one row lower, so that it covers only the preceding years. Here is what the same operation gives me in Stata:
year dvturnout m1turnout m2turnout m3turnout m4turnout m5turnout m6turnout m7turnout m8turnout m9turnout m10turnout
1960         .         .         .         .         .         .         .         .         .         .          .
1961         .      95.5      95.5      95.5      95.5      95.5      95.5      95.5      95.5      95.5       95.5
1962         .      95.3      95.4      95.4      95.4      95.4      95.4      95.4      95.4      95.4       95.4
1963         .      95.3      95.3  95.36667  95.36667  95.36667  95.36667  95.36667  95.36667  95.36667   95.36667
1964         .      95.7      95.5  95.43333     95.45     95.45     95.45     95.45     95.45     95.45      95.45
1965         .      95.7      95.7  95.56667      95.5      95.5      95.5      95.5      95.5      95.5       95.5
1966         .      95.7      95.7      95.7      95.6     95.54  95.53333  95.53333  95.53333  95.53333   95.53333
1967         .      95.1  95.39999      95.5     95.55      95.5  95.46667  95.47143  95.47143  95.47143   95.47143
1968         .      95.1      95.1      95.3  95.39999     95.46  95.43333  95.41428    95.425    95.425     95.425
1969         .      95.1      95.1      95.1     95.25     95.34  95.39999  95.38571    95.375  95.38889   95.38889
1970         .        95     95.05  95.06667    95.075      95.2  95.28333  95.34286   95.3375  95.33334      95.35
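Comparing the two tables, the Stata output differs from the rollmean() output in two ways: each average sits one row lower (the current year is excluded), and short windows at the start of the series are averaged over whatever years are available instead of being NA. A possible sketch that reproduces this with zoo, assuming that description captures the Stata behaviour; lagged_ma is a hypothetical helper name, not a zoo function:

```r
library(zoo)

# Same example series as in the question
turnout <- c(95.5, 95.3, 95.3, 95.7, 95.7, 95.7, 95.1, 95.1, 95.1, 95, 95)

# Hypothetical helper: mean of up to k previous values, excluding the current one
lagged_ma <- function(x, k) {
  # partial = TRUE averages the values that exist when fewer than k are available
  trailing <- rollapplyr(x, width = k, FUN = mean, partial = TRUE)
  # Move every average down one row so the current year is left out
  c(NA, head(trailing, -1))
}

# One column per window length, named after the Stata variables
ma <- sapply(1:10, function(k) lagged_ma(turnout, k))
colnames(ma) <- paste0("m", 1:10, "turnout")

result <- data.frame(year = 1960:1970, round(ma, 5))
print(result)
```

Values such as 95.39999 in the Stata table come out as 95.4 here, presumably because Stata stored those results in single precision.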