How can "for" read correctly encoded text from a UTF-8 encoded file?

Question

When I use "for" to read each line of the UTF-8 encoded file 1.txt, the output is garbled. How can I get the batch file to read UTF-8 encoded files correctly?

for /F "tokens=*" %%f in (1.txt) do echo %%f
pause

Then keep your fingers crossed that the font used for the console window supports the Unicode characters, see [Using another language (code page) in a batch file made for others](https://stackoverflow.com/a/48982681/3074564). Further, I recommend using `delims=` (which turns off line splitting) instead of `tokens=*` (which still splits the line and so removes leading spaces/tabs). I also recommend not using `f` as the loop variable, although that is possible, but for example `L` or `I` or `#`, which are characters not used for the modifiers explained in the help output on running `for /?` in a cmd window. — Mofi, Nov 07 '19 at 07:27
BTW: Is there any reason not to use the command `type`? Get help on this command by running `type /?` in a cmd window. — Mofi, Nov 07 '19 at 07:29
In addition to adding `chcp 65001`, you must also set the CMD font; otherwise it fails with "The system cannot write to the specified device." — user69485, Nov 07 '19 at 09:35
In the actual batch file, I need to use `for` to read each line from the file and pass it as a parameter to another CLI, so `type` does not apply. — user69485, Nov 07 '19 at 09:39
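
Putting the comments above together, a minimal sketch might look like the following. It assumes `1.txt` is saved as UTF-8 without a byte order mark and that the console font has glyphs for the characters in it; neither is confirmed in the question. `echo` only stands in for whatever command should receive each line.

```bat
@echo off
rem Sketch only. Assumptions: 1.txt is UTF-8 without a BOM and the
rem console font can display the characters it contains.

rem Switch the console to the UTF-8 code page before the file is read.
chcp 65001 >nul

rem delims= keeps each line whole, including leading spaces and tabs.
rem %%L is used because L is not one of the modifier letters listed by for /?.
for /F "usebackq delims=" %%L in ("1.txt") do echo %%L

pause
```

Note that `for /F` still skips empty lines and, with the default `eol`, lines that start with a semicolon. The code page also stays at 65001 in that console window after the batch file ends.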

score 0 · Answer 1 · edited Nov 07 '19 at 08:29

Use this:

for /F "tokens=* delims= " %%f in ('type 1.txt') do echo %%f

This will work because the `type` command reads the lines of a text file regardless of its encoding.

answered Nov 07 '19 at 08:27 by Wasif · edited Nov 07 '19 at 08:29 by Biffen

This does not work if the file `1.txt` contains UTF-8 encoded non-ASCII characters which should be displayed correctly in the console window. If the file `1.txt` contains only ASCII characters, there is no binary difference between an OEM/ANSI/ASCII encoded text file and a UTF-8 encoded text file without byte order mark (BOM). So the usage of `type` is irrelevant to the issue of non-ASCII characters not being displayed correctly in the console window. – Mofi Nov 07 '19 at 09:01
Also, if that worked, it would be enough to use just `type 1.txt`. Using the command `for` instead would mean: starting a new command process in the background with `%ComSpec% /c type 1.txt`; having `type` output the text to handle __STDOUT__ with conversion from UTF-8 to the OEM code page according to the country configured for the account in use; having `cmd.exe`, which executes `for`, capture that output, ignore empty lines and lines starting with a semicolon, and remove all leading spaces from the remaining lines; and finally outputting each remaining line in the console window of the command process which is processing the batch file. – Mofi Nov 07 '19 at 09:06