I recently got a task to change the encoding of a .txt file to Windows-1251, OEM866 and UTF-8 using only cmd. I've tried using:
- chcp 866
- cmd /u /c /d type 1.txt > 866.txt
But the resulting text file had UTF-16 encoding, despite looking like OEM866 text.
If you wish to stick to cmd, you may need the old 2004 iconv transcoder tools, so here is a download.cmd that fetches iconv.exe and its support files. However, read "Force encode from US-ASCII to UTF-8 (iconv)" for relevant advice, as it's easy to name the wrong input encoding to transcode from.
@echo off & Title Get-iConv
Rem Download libiconv-1.9.1 and support files on Windows 10 optionally include gettext-tools
set "iconv-dir=c:\text-iconv"
if not exist "%iconv-dir%" md "%iconv-dir%"
cd /d "%iconv-dir%"
if not exist gt-runtime.woe32.zip curl -o gt-runtime.woe32.zip http://ftp.gnu.org/gnu/gettext/gettext-runtime-0.13.1.bin.woe32.zip
tar -xf gt-runtime.woe32.zip bin
tar -xf gt-runtime.woe32.zip share/doc
Rem if not exist gt-tools.woe32.zip curl -o gt-tools.woe32.zip http://ftp.gnu.org/gnu/gettext/gettext-tools-0.13.1.bin.woe32.zip
Rem tar -xf gt-tools.woe32.zip bin
Rem tar -xf gt-tools.woe32.zip share/doc
if not exist libiconv.woe32.zip curl -o libiconv.woe32.zip http://ftp.gnu.org/gnu/libiconv/libiconv-1.9.1.bin.woe32.zip
tar -xf libiconv.woe32.zip bin
tar -xf libiconv.woe32.zip share/doc
cd bin
start "" cmd /k "%iconv-dir%\bin\iconv.exe" -h
start "" "%iconv-dir%\share\doc\libiconv\iconv.1.html"
start "" https://stackoverflow.com/questions/11303405/force-encode-from-us-ascii-to-utf-8-iconv
I'd say that the task (convert the encoding of given files from one encoding to another, like the iconv tool does) is solvable using only cmd: first, create two auxiliary binary files, bomUtf16le.bin and bomUtf8.bin, as follows:
REM do not run as a batch file; copy&paste the code into an open cmd window
:: create a testing folder and change the current directory
2>NUL md .\SO\69595742
pushd .\SO\69595742
:: create file bomUtf16le.bin (BOM, encoding utf16LE)
>NUL chcp 1252
<nul set /p x=ÿþ>bomUtf16le.bin
:: create file bomUtf8.bin (BOM, encoding utf8)
>NUL chcp 1252
<nul set /p x=ï»¿>bomUtf8.bin
:: create file a1200.txt (a Cyrillic text, encoding utf16LEbom)
>NUL copy /Y /B bomUtf16le.bin a1200.txt
cmd /U /D /C "(echo русский текст&echo кирилловский шрифт)>>a1200.txt"
popd
Important: do not run the above code snippet from a batch file; copy&paste the code into an open cmd window!
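As a cross-check outside cmd, the two BOM files can be reasoned about byte by byte. The sketch below is only an illustration in Python (not part of the cmd-only solution; the helper name `bom_as_cp1252` is made up here). It shows why the snippet switches to code page 1252 first: under that code page each BOM byte corresponds to exactly one typeable character.

```python
import codecs
import unicodedata


def bom_as_cp1252(bom: bytes) -> str:
    """Return the characters a BOM's bytes map to under code page 1252."""
    return bom.decode("cp1252")


# Byte sequences that bomUtf16le.bin and bomUtf8.bin are meant to contain.
boms = {
    "bomUtf16le.bin": codecs.BOM_UTF16_LE,  # FF FE
    "bomUtf8.bin": codecs.BOM_UTF8,         # EF BB BF
}

for file_name, bom in boms.items():
    chars = bom_as_cp1252(bom)
    # Print code points and names instead of the raw characters, so the
    # output does not depend on the console's own encoding.
    described = [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in chars]
    print(file_name, bom.hex(" "), described)
```

So typing those characters after `set /p x=` while code page 1252 is active should write exactly the BOM bytes, with no trailing line break.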
The code creates an initial testing file a1200.txt (encoding utf16LEbom). We could begin with a file of any supported encoding, 1251 or 866 or 65001 (== Utf8bom), because the conversions below are designed to work cyclically (proved by binary comparison using the fc command, and manually confirmed by opening all files in notepad++). The following code snippet assumes that the initial testing file is encoded as utf16LEbom.
Then run the following (run as a batch file, or copy&paste the code into an open cmd window):
@ECHO OFF
SETLOCAL EnableExtensions
:: run as a batch file, or copy&paste the code into an open cmd window
2>NUL md .\SO\69595742
pushd .\SO\69595742
:: convert file a1200.txt to cp1251
>NUL chcp 1251
type a1200.txt>x1251.txt
:: convert file a1200.txt to cp866
>NUL chcp 866
type a1200.txt>x866.txt
:: convert file a1200.txt to utf-8 BOM
>NUL copy /Y /B bomUtf8.bin x65001bom.txt
>NUL chcp 65001
type a1200.txt>>x65001Bom.txt
:: convert file x866.txt to file x1200.txt (encoding utf16LEbom)
>NUL copy /Y /B bomUtf16le.bin x1200.txt
>NUL chcp 866
cmd /U /D /C "type x866.txt>>x1200.txt"
:: Perform a binary comparison (FC: no differences encountered)
fc /B x1200.txt a1200.txt
:: convert file x1251.txt to file y1200.txt (encoding utf16LEbom)
:: analogous to: x866.txt to file x1200.txt
>NUL copy /Y /B bomUtf16le.bin y1200.txt
>NUL chcp 1251
cmd /U /D /C "type x1251.txt>>y1200.txt"
:: Perform a binary comparison (FC: no differences encountered)
fc /B y1200.txt a1200.txt
:: convert file x65001bom.txt to file z1200.txt (encoding utf16LEbom)
>NUL chcp 65001
cmd /U /D /C "type x65001bom.txt>z1200.txt"
:: Perform a binary comparison (FC: no differences encountered)
fc /B z1200.txt a1200.txt
:: convert file a1200.txt to x65001noBom.txt (utf-8 no BOM, merely for completeness)
>NUL chcp 65001
type a1200.txt>x65001noBom.txt
dir *.txt | findstr /I "\.txt$"
popd
goto :eof
Result: .\SO\69595742.bat
Comparing files x1200.txt and A1200.TXT
FC: no differences encountered
Comparing files y1200.txt and A1200.TXT
FC: no differences encountered
Comparing files z1200.txt and A1200.TXT
FC: no differences encountered
17/10/2021 19:24 72 a1200.txt
17/10/2021 21:49 72 x1200.txt
17/10/2021 21:49 35 x1251.txt
17/10/2021 21:49 67 x65001Bom.txt
17/10/2021 21:49 64 x65001noBom.txt
17/10/2021 21:49 35 x866.txt
17/10/2021 21:49 72 y1200.txt
17/10/2021 21:49 72 z1200.txt
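The file sizes in this listing can be sanity-checked without cmd. The following Python sketch is an independent cross-check, not part of the cmd-only method. It assumes the test file holds the two echoed lines, each terminated by CR LF, and computes how many bytes that text takes in each target encoding; the function name `encoded_sizes` is hypothetical.

```python
import codecs

# The two lines written by the echo commands, each ending in CR LF (assumption).
TEXT = "русский текст\r\nкирилловский шрифт\r\n"


def encoded_sizes(text: str) -> dict:
    """Byte length of the text in each encoding used by the conversions."""
    return {
        "utf-16-le-bom": len(codecs.BOM_UTF16_LE + text.encode("utf-16-le")),
        "cp1251": len(text.encode("cp1251")),
        "cp866": len(text.encode("cp866")),
        "utf-8-bom": len(text.encode("utf-8-sig")),  # utf-8-sig prepends the BOM
        "utf-8-noBom": len(text.encode("utf-8")),
    }


def round_trips(text: str, codec: str) -> bool:
    """True if encoding and decoding with the codec returns the same text."""
    return text.encode(codec).decode(codec) == text


for name, size in encoded_sizes(TEXT).items():
    print(f"{size:3d} {name}")
print("cp1251 reversible:", round_trips(TEXT, "cp1251"))
print("cp866 reversible:", round_trips(TEXT, "cp866"))
```

The computed sizes agree with the listing (72, 35, 35, 67 and 64 bytes). The round trip works here because every character of the sample exists in both code pages; text containing characters outside cp866 or cp1251 cannot be converted back without loss.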
Summary (incomplete): file conversions (⇆ reversible)
Direct:
- utf-16-le-bom ⇆ cp866
- utf-16-le-bom ⇆ cp1251
- utf-16-le-bom ⇆ utf-8-bom
- utf-16-le-bom → utf-8-noBom
Possible (through an auxiliary file):
- cp866 ⇆ utf-16-le-bom ⇆ cp1251
- cp866 ⇆ utf-16-le-bom ⇆ utf-8-bom
- utf-8-bom ⇆ utf-16-le-bom ⇆ cp1251
Possible utf-8-noBom → utf-8-bom as follows:
copy /B bomUtf8.bin + fileutf-8-noBom.txt fileutf-8-bom.txt
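Note that a binary concatenation like this prepends the BOM unconditionally, so applying it to a file that already starts with a BOM yields two of them. As a byte-level illustration only (Python, outside the cmd-only constraint; `with_utf8_bom` is a made-up helper name), a guarded version of the same operation might look like this:

```python
import codecs


def with_utf8_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 BOM unless the data already starts with one."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


sample = "кириллица".encode("utf-8")   # UTF-8 bytes without a BOM
once = with_utf8_bom(sample)
twice = with_utf8_bom(once)            # second call leaves the data unchanged

print(once[:3].hex(" "))               # the three BOM bytes
print(once == twice)
```

In pure cmd the equivalent precaution is simply to make sure the input file is BOM-less before running the copy command.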
Tested in Windows 10 with the following Administrative language settings; not tested with that Beta checkbox unticked: