I have a Russian-language dataset in .dta format. Stata displays the labels incorrectly, as a string of garbled symbols. It seems the file was encoded as Windows-1251 when it was created, while Stata uses a different encoding to display it.
Please let me know if you have any ideas.
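To illustrate the suspected encoding mismatch, here is a minimal Python sketch (Python rather than Stata, purely for illustration). It shows how Cyrillic text stored as Windows-1251 bytes turns into unrelated symbols when read under a different encoding. The sample word and the choice of Latin-1 as the "wrong" encoding are assumptions made for the demonstration; I do not know which encoding Stata actually applies to this file.

```python
# Illustrative only: the sample text and the "wrong" encoding are assumptions,
# not values taken from the actual dataset.
sample = "Привет"

# Bytes as they would be stored by software writing Windows-1251 (cp1251).
raw_bytes = sample.encode("cp1251")

# Reading the same bytes under a different single-byte encoding gives garbage.
garbled = raw_bytes.decode("latin-1")

# Reading them under the encoding they were written in recovers the text.
restored = raw_bytes.decode("cp1251")

print(garbled)
print(restored)
```

If the labels in the file look like the first printed line rather than the second, that would be consistent with the Windows-1251 guess.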
I tried to solve it by running:
clear all
set more off
unicode encoding set Windows-1251
unicode translate file_name.dta
but I got the following r(198) error:
(using Windows-1251 encoding)
File summary (before starting):
    1 file(s) specified
    1 file(s) to be examined ...
File file_name.dta (Stata dataset)
    234 variable names okay, ASCII
    1 variable name okay, already UTF-8
    all data labels okay, ASCII
    0 variable labels okay, ASCII
    144 variable labels okay, already UTF-8
    91 variable labels translated
r(198);
If I instead try:
unicode analyze file_name.dta
I also get an r(3300) error:
91 variable labels need translation
1 value-label name needs translation
st_vlload(): 3300 argument out of range
examine_dta_vallab_content(): - function returned error
examine_dta_vallabs_content(): - function returned error
examine_dta_file(): - function returned error
examine_file(): - function returned error
do_examine_files(): - function returned error
unicode_do(): - function returned error
unicode_analyze(): - function returned error
<istmt>: - function returned error
r(3300);