21

Does anybody know a tool, preferably for the Explorer context menu, to recursively change the encoding of files in a project from ISO-8859-1 to UTF-8 and other encodings? Freeware or not too expensive would be great.

Edit: Thanks for the answers, +1 for all of them. But I would really like to be able to just right-click a folder and say "convert all .php files to UTF-8". :) Further suggestions are appreciated; I'm starting a bounty.

thomaux
Pekka
  • I need this, too, for a bunch of GB2312 files. A utility that translates from Chinese to English recursively would be even better... :) – endolith Feb 23 '12 at 15:03

5 Answers

39

You can do this fairly easily with Windows PowerShell. Once you have read a file's content, you can write it back with a cmdlet such as Set-Content or Out-File, specifying UTF8 as the encoding.

Try something like:

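# Re-save every .txt file under the current directory as UTF-8.
Get-ChildItem *.txt -Recurse | ForEach-Object {
    # Read the whole file first so it is closed before being overwritten.
    $content = Get-Content -LiteralPath $_.FullName
    Set-Content -LiteralPath $_.FullName -Value $content -Encoding UTF8 -Force
}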
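For comparison, the same read-then-rewrite idea can be sketched outside PowerShell. This is only a minimal Python sketch, not part of the original answer: the function name `convert_tree` and its parameters are made up for illustration, and it assumes every matching file really is in the source encoding.

```python
from pathlib import Path


def convert_tree(root, pattern="*.txt", src="iso-8859-1", dst="utf-8"):
    """Re-encode every file under root matching pattern from src to dst.

    Hypothetical helper: assumes all matching files really use src.
    Returns the list of converted paths.
    """
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        # Decode the raw bytes with the source encoding, then write them back
        # in the target encoding. Working on bytes leaves line endings alone.
        text = path.read_bytes().decode(src)
        path.write_bytes(text.encode(dst))
        converted.append(path)
    return converted
```

A call such as `convert_tree(r"C:\project", "*.php")` would cover the case from the question. Running it twice on the same tree would mangle non-ASCII characters, because UTF-8 bytes also decode as ISO-8859-1 without raising an error.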
dstandish
5

I don't know of a way to do it from the context menu, but Notepad++ lets you change file encodings and it has a macro option, so you could automate the process.

Mark
  • 1
    I'm trying to do exactly that, but for some insane reason, stuff you do in the Encoding menu doesn't get saved in macros! – Nathan Stretch May 31 '10 at 23:51
3

If you import a test.reg file with the following content:

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\Directory\shell\ConvertPHP]
@="convert all .php files to UTF-8"

[HKEY_CLASSES_ROOT\Directory\shell\ConvertPHP\command]
@="cmd.exe /c C:\\TEMP\\t.cmd php \"%1\""

After importing it, the menu item "convert all .php files to UTF-8" appears in the Explorer context menu of every directory. Choosing the item starts the batch file C:\TEMP\t.cmd with the string "php" as the first parameter and the quoted directory name as the second parameter (of course, you can drop the first parameter "php" if you don't need it). A t.cmd file like

echo %1>C:\TEMP\t.txt
echo %2>>C:\TEMP\t.txt

can be used to verify that all of this works.
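The doubled backslashes and backslash-escaped quotes in the .reg listing above are easy to get wrong by hand. As a rough illustration, the following Python sketch generates the same text from an unescaped command line. The helper names `reg_escape` and `build_menu_reg` are hypothetical, and writing the result to a file is left to the caller.

```python
def reg_escape(value):
    """Escape a string for use inside a quoted value in a .reg file."""
    # Backslashes must be doubled first, then quotes get a backslash prefix.
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_menu_reg(key_name, label, command):
    """Return .reg text that adds a context-menu item for directories."""
    base = "HKEY_CLASSES_ROOT\\Directory\\shell\\" + key_name
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + base + "]",
        '@="' + reg_escape(label) + '"',
        "",
        "[" + base + "\\command]",
        '@="' + reg_escape(command) + '"',
        "",
    ]
    return "\r\n".join(lines)
```

Calling `build_menu_reg("ConvertPHP", "convert all .php files to UTF-8", 'cmd.exe /c C:\\TEMP\\t.cmd php "%1"')` should reproduce the two keys shown above.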

So you can convert the *.php files with any tool you prefer. For example, you can use Windows PowerShell (see Alan's answer).

If you also want to be asked for the extension (such as PHP), you can write a small program that displays a corresponding input dialog and then starts the Windows PowerShell script.
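One way such a small program might handle its arguments is sketched below in Python. It is an assumption-laden illustration rather than the answer's own tool: `parse_menu_args` is a made-up name, and it asks on the console instead of showing a dialog. It accepts either the extension and the directory (as t.cmd receives them) or the directory alone, in which case it asks for the extension.

```python
def parse_menu_args(argv, ask=input):
    """Return (extension, directory) from the context-menu arguments.

    Accepts [extension, directory] or just [directory]; in the second case
    the extension is requested through ask (input() by default).
    """
    if len(argv) == 2:
        ext, directory = argv
    elif len(argv) == 1:
        directory = argv[0]
        ext = ask("Extension to convert (e.g. php): ")
    else:
        raise ValueError("expected [extension] directory")
    # Normalize "PHP", ".php" and " php " to "php".
    ext = ext.strip().lstrip(".").lower()
    if not ext:
        raise ValueError("no extension given")
    return ext, directory
```

In a script launched from the registry command, this would typically be called as `parse_menu_args(sys.argv[1:])`. The `ask` parameter exists so that a dialog function could be swapped in later.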

Oleg
2

Here's a nice recursive converter written in ASP; you need IIS running on your computer:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<HTML>
<HEAD>
<TITLE>Charset Converter - TFI 13/02/2015</TITLE>
</HEAD>
<BODY style='font-family:arial;font-size:11px;color:white;background-color:#7790c4;font-size:15px'>
<H1 style='color:yellow'>Recursive file charset converter</H1>
by TFI 13/02/2015<BR><BR>
<%
totalconverted=0

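' Converts one file in place: reads it as fromchar, writes a temporary copy
' encoded as tochar, then replaces the original with that copy.
' Note: fromchar and tochar are the page-level variables filled from the form.
Function transcoder( ANSIFile)
   UFT8FileOut=ANSIFile&".tempfile" 
   Set oFS    = CreateObject( "Scripting.FileSystemObject" )
   Set oFrom  = CreateObject( "ADODB.Stream" )
   sFFSpec    = oFS.GetAbsolutePathName(ANSIFile)
   Set oTo    = CreateObject( "ADODB.Stream" )
   sTFSpec    = oFS.GetAbsolutePathName(UFT8FileOut)
   ' Load the source file as text in the source charset
   oFrom.Type    = 2 'adTypeText
   oFrom.Charset = fromchar '"Windows-1252"
   oFrom.Open
   oFrom.LoadFromFile sFFSpec
   ' Write the same text to the temporary file in the target charset
   oTo.Type    = 2 'adTypeText
   oTo.Charset = tochar '"utf-8"
   oTo.Open
   oTo.WriteText oFrom.ReadText
   oTo.SaveToFile sTFSpec,2 '2 = overwrite an existing file
   oFrom.Close
   oTo.Close
   ' Replace the original file with the converted copy
   oFS.DeleteFile sFFSpec
   oFS.MoveFile sTFSpec,sFFSpec
End Function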

Function ConvertFiles(objFolder, sExt, bRecursive, fromchar, tochar)
    Dim objFile, objSubFolder
    For each objFile in objFolder.Files
        If Ucase(fso.GetExtensionName(objFile)) = ucase(sExt) Then
            transcoder objFile.path
            totalconverted=totalconverted+1
            response.write "&bull; Converted <B>"&fso.GetAbsolutePathName(objFile)&"</B> from <B>"&fromchar&"</B> to <B>"&tochar&"</B><BR>"
        End If
    Next

    If bRecursive = true then
        For each objSubFolder in objFolder.Subfolders
            ConvertFiles objSubFolder, sExt, true, fromchar, tochar
        Next
    End If
End Function

sFolder=request.form("sFolder")
sExtension=request.form("sExtension")
fromchar=request.form("fromchar")
tochar=request.form("tochar")
sSubs=request.form("sSubs")
if sSubs="1" then
    sub1=True
else
    sub1=false
end if  

if len(sExtension)=0 then sExtension="asp"
if len(sFolder)>0 and len(fromchar)>0 and len(tochar)>0 then

    Dim fso, folder, files, NewsFile, sFolder, objFSO, strFileIn, strFileOut
    Set fso = CreateObject("Scripting.FileSystemObject")
    'sFolder = "C:\inetpub\wwwroot\naoutf8"
    ConvertFiles fso.GetFolder(sFolder), sExtension, Sub1, fromchar, tochar
    response.write "<hr><br>Total files converted: "&totalconverted&"<BR><BR>New conversion?<br><br>"
end if
%>  
<FORM name=ndata method=post action="UTF8converter.asp">
<TABLE cellspacing=0 cellpadding=5>
<TR>
    <TD>Folder to process:</TD>
    <TD><INPUT name=sFolder style='width:350px' placeholder="C:\example"></TD>
</TR>   
<TR>
    <TD>Extension:</TD>
    <TD><INPUT name=sExtension style='width:50px' value='asp'> (default is .asp)</TD>
</TR>
<TR>
    <TD>Process subfolders:</TD>
    <TD><INPUT type=checkbox name=sSubs value='1' checked></TD>
</TR>
<TR>
    <TD>From charset:</TD>
    <TD><select name=fromchar>
    <option value="big5">charset=big5 - Chinese Traditional (Big5)
    <option value="euc-kr">charset=euc-kr - Korean (EUC)
    <option value="iso-8859-1">iso-8859-1 - Western Alphabet
    <option value="iso-8859-2">iso-8859-2 - Central European Alphabet (ISO)
    <option value="iso-8859-3">iso-8859-3 - Latin 3 Alphabet (ISO)
    <option value="iso-8859-4">iso-8859-4 - Baltic Alphabet (ISO)
    <option value="iso-8859-5">iso-8859-5 - Cyrillic Alphabet (ISO)
    <option value="iso-8859-6">iso-8859-6 - Arabic Alphabet (ISO)
    <option value="iso-8859-7">iso-8859-7 - Greek Alphabet (ISO)
    <option value="iso-8859-8">iso-8859-8 - Hebrew Alphabet (ISO)
    <option value="koi8-r">koi8-r - Cyrillic Alphabet (KOI8-R)
    <option value="shift-jis">shift-jis - Japanese (Shift-JIS)
    <option value="x-euc">x-euc - Japanese (EUC)
    <option value="utf-8">utf-8 - Universal Alphabet (UTF-8)
    <option value="windows-1250">windows-1250 - Central European Alphabet (Windows)
    <option value="windows-1251">windows-1251 - Cyrillic Alphabet (Windows)
    <option value="windows-1252" selected>windows-1252 - Western Alphabet (Windows)
    <option value="windows-1253">windows-1253 - Greek Alphabet (Windows)
    <option value="windows-1254">windows-1254 - Turkish Alphabet
    <option value="windows-1255">windows-1255 - Hebrew Alphabet (Windows)
    <option value="windows-1256">windows-1256 - Arabic Alphabet (Windows)
    <option value="windows-1257">windows-1257 - Baltic Alphabet (Windows)
    <option value="windows-1258">windows-1258 - Vietnamese Alphabet (Windows)
    <option value="windows-874">windows-874 - Thai (Windows)
    </select></TD>
</TR>
<TR>
    <TD>To charset:</TD>
    <TD><select name=tochar>
    <option value="big5">big5 - Chinese Traditional (Big5)
    <option value="euc-kr">euc-kr - Korean (EUC)
    <option value="iso-8859-1">iso-8859-1 - Western Alphabet
    <option value="iso-8859-2">iso-8859-2 - Central European Alphabet (ISO)
    <option value="iso-8859-3">iso-8859-3 - Latin 3 Alphabet (ISO)
    <option value="iso-8859-4">iso-8859-4 - Baltic Alphabet (ISO)
    <option value="iso-8859-5">iso-8859-5 - Cyrillic Alphabet (ISO)
    <option value="iso-8859-6">iso-8859-6 - Arabic Alphabet (ISO)
    <option value="iso-8859-7">iso-8859-7 - Greek Alphabet (ISO)
    <option value="iso-8859-8">iso-8859-8 - Hebrew Alphabet (ISO)
    <option value="koi8-r">koi8-r - Cyrillic Alphabet (KOI8-R)
    <option value="shift-jis">shift-jis - Japanese (Shift-JIS)
    <option value="x-euc">x-euc - Japanese (EUC)
    <option value="utf-8" selected>utf-8 - Universal Alphabet (UTF-8)
    <option value="windows-1250">windows-1250 - Central European Alphabet (Windows)
    <option value="windows-1251">windows-1251 - Cyrillic Alphabet (Windows)
    <option value="windows-1252">windows-1252 - Western Alphabet (Windows)
    <option value="windows-1253">windows-1253 - Greek Alphabet (Windows)
    <option value="windows-1254">windows-1254 - Turkish Alphabet
    <option value="windows-1255">windows-1255 - Hebrew Alphabet (Windows)
    <option value="windows-1256">windows-1256 - Arabic Alphabet (Windows)
    <option value="windows-1257">windows-1257 - Baltic Alphabet (Windows)
    <option value="windows-1258">windows-1258 - Vietnamese Alphabet (Windows)
    <option value="windows-874">windows-874 - Thai (Windows)
    </select></TD>
</TR>
</TABLE><BR>
    <INPUT TYPE=BUTTON onClick='if(document.ndata.sFolder.value.length>0)document.ndata.submit()' value='Convert folder and subfolders'>
</FORM> 
</BODY>
</HTML>
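The core of the page is the transcoder function: read the file in the source charset, write a temporary copy in the target charset, then replace the original. As a hedged illustration, the same technique might look like this in Python. The name `transcode_file` is made up here, and the sketch assumes the file fits in memory.

```python
import codecs
import os


def transcode_file(path, from_charset, to_charset):
    """Convert one file in place via a temporary file next to it."""
    # Fail early on unknown charset names (raises LookupError).
    codecs.lookup(from_charset)
    codecs.lookup(to_charset)

    temp_path = path + ".tempfile"
    # newline="" keeps line endings exactly as they are in the source file.
    with open(path, "r", encoding=from_charset, newline="") as src:
        text = src.read()
    with open(temp_path, "w", encoding=to_charset, newline="") as dst:
        dst.write(text)
    # Swap the converted copy into place only after it was written completely.
    os.replace(temp_path, path)
```

Python's "utf-8" codec writes no byte order mark; "utf-8-sig" is the variant that does, in case the target application expects one.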
Niente3
1

I know this answer comes late, but here are two command-line apps that convert encodings. Just write a batch-file wrapper for one of them and add it to your * key in the registry.

http://www.autohotkey.com/forum/topic10796.html

http://www.gbordier.com/gbtools/stringconverter.htm

I used stringconverter by adding it as a button in my file manager, FreeCommanderXE. It only converts one file at a time, but I can click a file, press the convert button, and then move on to the next one.

bgmCoder