I was recently given a task: change the encoding of a .txt file to Windows-1251, OEM866 and UTF-8 using only cmd. I tried:

  1. chcp 866
  2. cmd /u /c /d type 1.txt > 866.txt

But the resulting text file was encoded as UTF-16, despite looking like OEM866 text.
  • `cmd /?` says `/U` _Causes the output of internal commands to a pipe or file to be Unicode_ – JosefZ Oct 16 '21 at 14:24
  • Using only cmd.exe will not get you there. Even chcp.com is a separate program. Look for a Windows implementation of iconv. – lit Oct 16 '21 at 17:25

2 Answers

If you wish to stick to cmd, you may need the old (2004) iconv transcoder tools; below is a download.cmd that fetches iconv.exe and its support files. However, read "Force encode from US-ASCII to UTF-8 (iconv)" first for relevant advice, as it's easy to specify the wrong input encoding to transcode from.

@echo off & Title Get-iConv

Rem Download libiconv-1.9.1 and support files on Windows 10; optionally include gettext-tools (commented out below)

set "iconv-dir=c:\text-iconv"
if not exist "%iconv-dir%" md "%iconv-dir%"
cd /d "%iconv-dir%"

if not exist gt-runtime.woe32.zip curl -o gt-runtime.woe32.zip http://ftp.gnu.org/gnu/gettext/gettext-runtime-0.13.1.bin.woe32.zip
tar -xf  gt-runtime.woe32.zip bin 
tar -xf  gt-runtime.woe32.zip share/doc

Rem if not exist gt-tools.woe32.zip curl -o gt-tools.woe32.zip http://ftp.gnu.org/gnu/gettext/gettext-tools-0.13.1.bin.woe32.zip
Rem tar -xf  gt-tools.woe32.zip bin
Rem tar -xf  gt-tools.woe32.zip share/doc

if not exist libiconv.woe32.zip curl -o libiconv.woe32.zip http://ftp.gnu.org/gnu/libiconv/libiconv-1.9.1.bin.woe32.zip
tar -xf  libiconv.woe32.zip bin
tar -xf  libiconv.woe32.zip share/doc

cd bin
start "" cmd /k "%iconv-dir%\bin\iconv.exe" -h
start "" "%iconv-dir%\share\doc\libiconv\iconv.1.html"
start "" https://stackoverflow.com/questions/11303405/force-encode-from-us-ascii-to-utf-8-iconv
K J

I'd say the task (converting a file from one encoding to another, as the iconv tool does) is solvable using only cmd. First, create two auxiliary binary files, bomUtf16le.bin and bomUtf8.bin, as follows:

REM do not run as a batch file; copy&paste the code into an open cmd window

:: create a testing folder and change the current directory
2>NUL md .\SO\69595742
pushd    .\SO\69595742

:: create file bomUtf16le.bin (BOM, encoding utf16LE)
>NUL chcp 1252
<nul set /p x=ÿþ>bomUtf16le.bin
:: create file bomUtf8.bin    (BOM, encoding utf8)
>NUL chcp 1252
<nul set /p x=ï»¿>bomUtf8.bin

:: create file a1200.txt (a Cyrillic text, encoding utf16LEbom)
>NUL copy /Y /B bomUtf16le.bin a1200.txt 
cmd /U /D /C "(echo русский текст&echo кирилловский шрифт)>>a1200.txt"

popd

Important: do not run the above code snippet from a batch file; copy&paste the code into an open cmd window!
The code creates an initial testing file a1200.txt (encoding utf16LEbom). We could begin with a file in any of the supported encodings, 1251, 866 or 65001 (== Utf8bom), because the conversions below are designed to work cyclically (proved by binary comparison using the fc command, and manually confirmed by opening all files in notepad++). The following code snippet assumes the initial testing file is encoded as utf16LEbom.
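
Why `chcp 1252` matters when creating the two BOM files: the characters typed after `set /p x=` (`ÿþ` for UTF-16LE, `ï»¿` for UTF-8) are written as single bytes under that code page, and those bytes happen to be the BOMs. A minimal cross-check follows; Python is an assumption here, used only to inspect the bytes, and is not part of the cmd-only approach.

```python
import codecs

# Characters typed after "set /p x=" while code page 1252 is active
utf16le_bom_chars = "ÿþ"
utf8_bom_chars = "ï»¿"

# Encoding them as Windows-1252 yields the BOM byte sequences
utf16le_bom_bytes = utf16le_bom_chars.encode("cp1252")
utf8_bom_bytes = utf8_bom_chars.encode("cp1252")

# Compare against the constants defined in the standard codecs module
print(utf16le_bom_bytes.hex(), utf16le_bom_bytes == codecs.BOM_UTF16_LE)
print(utf8_bom_bytes.hex(), utf8_bom_bytes == codecs.BOM_UTF8)
```

Under a different active code page the same characters could map to other bytes, which is presumably why the snippet pins the code page before each `set /p`.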

Then run the following (run as a batch file, or copy&paste the code into an open cmd window):

@ECHO OFF
SETLOCAL EnableExtensions

:: run as a batch file, or copy&paste the code into an open cmd window

2>NUL md .\SO\69595742
pushd    .\SO\69595742

:: convert file a1200.txt to cp1251
>NUL chcp 1251
type a1200.txt>x1251.txt

:: convert file a1200.txt to cp866
>NUL chcp 866
type a1200.txt>x866.txt

:: convert file a1200.txt to utf-8 BOM
>NUL copy /Y /B bomUtf8.bin x65001bom.txt
>NUL chcp 65001
type a1200.txt>>x65001Bom.txt

:: convert file x866.txt to file x1200.txt (encoding utf16LEbom)
>NUL copy /Y /B bomUtf16le.bin x1200.txt
>NUL chcp 866
cmd /U /D /C "type x866.txt>>x1200.txt"

:: Perform a binary comparison (FC: no differences encountered)
fc /B x1200.txt a1200.txt

:: convert file x1251.txt to file y1200.txt (encoding utf16LEbom)
:: analogous to: x866.txt to file x1200.txt
>NUL copy /Y /B bomUtf16le.bin y1200.txt
>NUL chcp 1251
cmd /U /D /C "type x1251.txt>>y1200.txt"

:: Perform a binary comparison (FC: no differences encountered)
fc /B y1200.txt a1200.txt

:: convert file x65001bom.txt to file z1200.txt (encoding utf16LEbom)
>NUL chcp 65001
cmd /U /D /C "type x65001bom.txt>z1200.txt"

:: Perform a binary comparison (FC: no differences encountered)
fc /B z1200.txt a1200.txt

:: convert file a1200.txt to x65001noBom.txt (utf-8 no BOM, merely for completeness)
>NUL chcp 65001
type a1200.txt>x65001noBom.txt

dir *.txt | findstr /I "\.txt$"

popd

goto :eof

Result: .\SO\69595742.bat

Comparing files x1200.txt and A1200.TXT
FC: no differences encountered

Comparing files y1200.txt and A1200.TXT
FC: no differences encountered

Comparing files z1200.txt and A1200.TXT
FC: no differences encountered

17/10/2021  19:24                72 a1200.txt
17/10/2021  21:49                72 x1200.txt
17/10/2021  21:49                35 x1251.txt
17/10/2021  21:49                67 x65001Bom.txt
17/10/2021  21:49                64 x65001noBom.txt
17/10/2021  21:49                35 x866.txt
17/10/2021  21:49                72 y1200.txt
17/10/2021  21:49                72 z1200.txt
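
The file sizes in the listing can be cross-checked independently. The sketch below is not part of the cmd-only approach (Python is an assumption, used only for verification); it also assumes that each `echo` terminates its line with CRLF. It encodes the same two lines of test text and counts the bytes for each target encoding.

```python
# The two lines written by echo, each assumed to end with CRLF
text = "русский текст\r\nкирилловский шрифт\r\n"

sizes = {
    "utf-16-le + BOM": len(b"\xff\xfe" + text.encode("utf-16-le")),
    "cp1251": len(text.encode("cp1251")),
    "cp866": len(text.encode("cp866")),
    "utf-8 + BOM": len(text.encode("utf-8-sig")),  # utf-8-sig prepends the BOM
    "utf-8 no BOM": len(text.encode("utf-8")),
}

for name, size in sizes.items():
    print(f"{size:3d}  {name}")
```

The counts should match the `dir` output: the single-byte code pages use one byte per character, UTF-16LE uses two plus a 2-byte BOM, and UTF-8 uses two bytes per Cyrillic letter plus an optional 3-byte BOM.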

Summary (incomplete): file conversions (⇆ reversible)

Direct:

  • utf-16-le-bom ⇆ cp866
  • utf-16-le-bom ⇆ cp1251
  • utf-16-le-bom ⇆ utf-8-bom
  • utf-16-le-bom → utf-8-noBom

Possible (thru an auxiliary file):

  • cp866 ⇆ utf-16-le-bom ⇆ cp1251
  • cp866 ⇆ utf-16-le-bom ⇆ utf-8-bom
  • utf-8-bom ⇆ utf-16-le-bom ⇆ cp1251
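
The idea of going through an auxiliary UTF-16 file can be illustrated outside cmd as well. In this sketch (Python is an assumption, and `transcode` is a hypothetical helper named here only for illustration), bytes in one single-byte code page are pivoted through UTF-16LE and re-encoded in another, then converted back to confirm the round trip.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Hypothetical helper: convert src-encoded bytes to dst via a UTF-16LE pivot."""
    pivot = data.decode(src).encode("utf-16-le")  # plays the role of the auxiliary file
    return pivot.decode("utf-16-le").encode(dst)

sample = "русский текст\r\n"
as_866 = sample.encode("cp866")

# cp866 -> (utf-16-le) -> cp1251, then back again
as_1251 = transcode(as_866, "cp866", "cp1251")
back = transcode(as_1251, "cp1251", "cp866")

print(as_1251 == sample.encode("cp1251"), back == as_866)
```

The round trip is lossless here only because every character in the sample exists in both code pages; text containing characters missing from the target code page would not survive.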

Possible utf-8-noBom → utf-8-bom as follows:

copy /B bomUtf8.bin + fileutf-8-noBom.txt fileutf-8-bom.txt

Tested on Windows 10 with the Administrative language settings shown in the original screenshot (image: Administrative language settings); not tested with the Beta checkbox in that dialog unticked.

JosefZ