I am trying to impute two variables simultaneously in Stata, say y and x, and then run a linear regression of y on x.

The code I used is:

mi set mlong
mi register imputed y x

mi impute regress y a b c, add(10)
mi impute regress x a b c, add(10)
mi estimate: regress y x

I run into an error: "estimation sample varies between m=1 and m=11". Can someone help me out? Thanks!

Sheldon
  • Note that x and y have different numbers of missing values. – Sheldon Apr 19 '16 at 02:26
  • Gee, a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) would really help here. – alexwhitworth Apr 26 '16 at 17:26
  • hotdeck is one way: https://stackoverflow.com/questions/53324137/simple-way-to-perform-a-hot-deck-imputation-in-stata/ – JohnE Nov 16 '18 at 18:22

1 Answer

I would do this with chained equations, which impute y and x together in a single call. The code below should work (Part 1 only generates a mock dataset and can be skipped):

* Part 1

clear all
set seed 0945 
set obs 50
gen y0 = _n 
gen y = runiform()
sort y
gen x0 = _n
gen x = runiform()
sort x
* Set y to missing in observations 1, 5, 10, ..., 50
replace y = . if _n == 1 | mod(_n, 5) == 0
sort y
* Set x to missing in observations 1, 5, 10, ..., 50 (in the new sort order)
replace x = . if _n == 1 | mod(_n, 5) == 0
gen a = _n 
sort x
gen b = _n 
gen c = _n 

* Part 2

mi set mlong
mi register imputed y x

mi impute chained (regress) y x = a b c, add(10)
mi estimate, dots: regress y x
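
As for why the original code fails: my understanding is that each mi impute call with add(10) appends ten new imputations rather than filling in the existing ones, so the two calls leave 20 imputations, with only y imputed in m=1 to 10 and only x imputed in m=11 to 20. That would explain why mi estimate sees a different estimation sample at m=1 and m=11. Below is a minimal sketch that reproduces this and then checks the chained version. The mock data, the seed and the missingness patterns are my own assumptions for illustration, not the asker's data.

```stata
* Mock data (assumed for illustration only)
clear all
set seed 12345
set obs 50
gen a = rnormal()
gen b = rnormal()
gen c = rnormal()
gen y = a + b + rnormal()
gen x = b + c + rnormal()
replace y = . if mod(_n, 5) == 0     // y missing in every 5th observation
replace x = . if mod(_n, 7) == 0     // x missing in every 7th observation

* Two separate univariate calls, as in the question
mi set mlong
mi register imputed y x
mi impute regress y a b c, add(10)
mi impute regress x a b c, add(10)
mi query                             // M should now be 20, not 10
mi xeq 1 11: count if missing(y, x)  // both should still show missing values

* Go back to the original data and impute both variables in one call
mi extract 0, clear
mi set mlong
mi register imputed y x
mi impute chained (regress) y x = a b c, add(10)
mi xeq 1/10: count if missing(y, x)  // should be 0 in every imputation
mi estimate: regress y x
```

If the counts after the chained call are all zero, every imputation is complete for both variables and mi estimate has the same sample throughout.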