15

What is the best way to convert UTF-16 files to UTF-8? I need to do this in a cmd script.

chills42
Grzenio

6 Answers

25

There is a GNU tool, recode, which you can also use on Windows. For example:

recode utf16..utf8 text.txt
Kaarel
  • 6
    A Windows version of 'recode' can be downloaded as part of the 'GNU utilities for Win32' package from sourceforge: http://downloads.sourceforge.net/unxutils/UnxUtils.zip?modtime=1172730504&big_mirror=0 – msanders Jan 08 '09 at 11:38
  • 2
    Short note - this also works on ubuntu linux with an apt-get install recode. Handy. – Danny Staple Nov 05 '12 at 11:42
15

An alternative to Ruby would be to write a small .NET program in C# (.NET 1.0 would be fine, although 2.0 would be simpler :) - it's a pretty trivial bit of code. Were you hoping to do it without any other applications at all? If you want a bit of code to do it, add a comment and I'll fill in the answer...

EDIT: Okay, this is without any kind of error checking, but...

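using System;
using System.IO;
using System.Text;

// Usage: FileConverter <inputFile> <outputFile>
class FileConverter
{
  static void Main(string[] args)
  {
    string inputFile = args[0];
    string outputFile = args[1];
    // Encoding.Unicode is UTF-16 little-endian.
    using (StreamReader reader = new StreamReader(inputFile, Encoding.Unicode))
    {
      // false = overwrite instead of append; Encoding.UTF8 writes a byte order mark.
      using (StreamWriter writer = new StreamWriter(outputFile, false, Encoding.UTF8))
      {
        CopyContents(reader, writer);
      }
    }
  }

  // Copies all characters from input to output in fixed-size chunks.
  static void CopyContents(TextReader input, TextWriter output)
  {
    char[] buffer = new char[8192];
    int len;
    // Read returns 0 once the end of the input is reached.
    while ((len = input.Read(buffer, 0, buffer.Length)) != 0)
    {
      output.Write(buffer, 0, len);
    }
  }
}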
Jon Skeet
  • I was hoping there is a utility I could just use :) I would be grateful for a bit of code, cheers. – Grzenio Nov 05 '08 at 16:15
  • FYI, here is a little utility based on that code: [https://github.com/paulroho/ConvertToUtf8](https://github.com/paulroho/ConvertToUtf8). – paulroho Mar 05 '16 at 22:33
10

Certainly, the easiest way is to load the file into Notepad, then save it again with UTF-8 encoding. It's an option in the Save As dialog box.

Tor Haugen
  • 3
    Cheers, I can use it as a workaround, but my script needs to do this conversion, I can't convert every file manually.... – Grzenio Nov 05 '08 at 15:03
  • 1
    Although it doesn't actually answer the question because it doesn't work in a script, it sure solved my problem! Thanks – davidreedernst Sep 18 '14 at 16:17
7

Perhaps with iconv?
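
For illustration, a minimal sketch of what an iconv call could look like. The file names are hypothetical, and the first command, which only creates a tiny UTF-16 sample so the example is self-contained, assumes a POSIX-style shell. The iconv line itself should work the same way from cmd, assuming a Windows port of iconv is on the PATH.

```shell
# Create a small UTF-16LE sample with a byte order mark
# (bytes FF FE, followed by "hi" and a newline, two bytes per character).
printf '\377\376h\000i\000\n\000' > sample_utf16.txt

# -f names the source encoding, -t the target encoding.
# iconv writes to standard output, so redirect it to the new file.
iconv -f UTF-16 -t UTF-8 sample_utf16.txt > sample_utf8.txt
```

If the input has no byte order mark, naming the byte order explicitly (UTF-16LE or UTF-16BE) instead of UTF-16 avoids relying on the implementation's default.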

PhiLho
7

You can do this easily with built-in PowerShell cmdlets, which you can invoke from cmd:

C:\> powershell -c "Get-Content mytext.txt | Set-Content -Encoding utf8 mytext_utf8.txt"

Edit: if you're already in PowerShell, you can drop the powershell -c wrapper. Using aliases shortens it further:

> gc mytext.txt | sc -Encoding utf8 mytext_utf8.txt
Ben Collins
1

If you have a Ruby distribution installed, you can call a Ruby script that takes care of the conversion:

Ruby script to convert file(s) character encoding

In the same spirit: Perl script

In the absence of script support, you would have to code it yourself, like this C++ source, which uses a WideCharToMultiByte() call...

VonC