17

The "Convert to UTF-8 without BOM" function of Notepad++ is really nice, but I have 200 files and all of them need to be converted. So I found this little Python script:

import os
from Npp import notepad, console  # objects provided by the Python Script plugin

filePathSrc = "C:\\Temp\\UTF8"
# Binary formats that must not be opened and re-encoded
skipExt = ('.jar', '.ear', '.gif', '.jpg', '.jpeg', '.xls', '.png', '.cab', '.ico')

for root, dirs, files in os.walk(filePathSrc):
    for fn in files:
        # Compare case-insensitively so that .JPG, .Png etc. are skipped too
        if not fn.lower().endswith(skipExt):
            path = os.path.join(root, fn)
            notepad.open(path)
            console.write(path + "\r\n")
            notepad.runMenuCommand("Encoding", "Convert to UTF-8 without BOM")
            notepad.save()
            notepad.close()

It goes through every file (I can see this), but after it finishes the charset is still ANSI in my case :/

Can anyone help me?

Phil
  • Are there any error messages? Are you running this in the Notepad++ Python Script plugin? Maybe you can check whether there really is a "Convert to UTF-8 without BOM" entry in the Encoding menu. In my Notepad++ there is only "Convert to UTF-8". It could be worth changing the string. – Lars Fischer Feb 20 '16 at 17:33
  • Right, I use this plugin. And in my Notepad++ there is both "Convert to UTF-8 without BOM" and "Convert to UTF-8". – Phil Feb 20 '16 at 17:40

4 Answers

22

Here is what worked for me:

Go to Notepad++ -> Plugins -> Plugins Admin.

Find and install the Python Script plugin.

Create a new Python script with Plugins -> Python Script -> New script.

Insert this code into your script:

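import os
import sys
from Npp import notepad, console  # objects provided by the Python Script plugin

filePathSrc = "C:\\Users\\YourUsername\\Desktop\\txtFolder"
for root, dirs, files in os.walk(filePathSrc):
    for fn in files:
        # Only convert .txt and .csv files
        if fn[-4:] == '.txt' or fn[-4:] == '.csv':
            path = os.path.join(root, fn)
            notepad.open(path)
            console.write(path + "\r\n")
            # Menu and item names must match the Notepad++ UI language
            notepad.runMenuCommand("Encoding", "Convert to UTF-8")
            notepad.save()
            notepad.close()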

Replace C:\\Users\\YourUsername\\Desktop\\txtFolder with the path to the folder that contains your files.

The script processes .txt and .csv files and ignores all other files in the folder.

Run the script with Plugins -> Python Script -> Scripts -> name of your script.

Hrvoje
  • Are you sure the second condition ``fn[-5:]`` is correct? I believe it should be ``fn[-4:]`` too, because ``.csv`` is the same length as ``.txt``. Besides, I would recommend using the ``endswith`` method if possible. – turbolocust Jun 15 '20 at 17:39
  • @turbolocust right you are, my friend. Changed accordingly. – Hrvoje Jun 17 '20 at 07:34
  • This deserves an upvote, as I tried multiple things on Windows and this is the only one that actually worked. Thank you – tim Jul 13 '21 at 16:11
  • I had problems with some files that have non-ASCII characters in their names, so I had to change the "opening" part of the script: `filePath = os.path.join(root,fn).decode(sys.getfilesystemencoding()).encode('utf8')`, then `notepad.open(filePath)` and `console.write(filePath + "\r\n")`. Otherwise Notepad++ prompted me to save the file as a new copy with a messed-up filename. – Almighty May 16 '22 at 07:32
  • Source: https://gist.github.com/bjverde/a6e822c91b0826ce05930f0f9aaec61c – mesompi Oct 10 '22 at 19:19
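
The `endswith` suggestion from the comments could look roughly like the sketch below. The helper name `iter_text_files` and the `TEXT_EXTENSIONS` tuple are hypothetical, not part of the answer's script; the code only collects matching paths and does not touch Notepad++ itself.

```python
import os

# Hypothetical names for illustration; the extensions mirror the answer's script.
TEXT_EXTENSIONS = ('.txt', '.csv')


def iter_text_files(top_dir, extensions=TEXT_EXTENSIONS):
    """Yield full paths of files under top_dir whose names end with one of the extensions."""
    for root, dirs, files in os.walk(top_dir):
        for fn in files:
            # str.endswith accepts a tuple; lower() makes the match case-insensitive
            if fn.lower().endswith(extensions):
                yield os.path.join(root, fn)
```

Inside the plugin script, the two nested loops could then be replaced by a single `for path in iter_text_files(filePathSrc):` loop that calls `notepad.open(path)` and the remaining steps as before.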
8

Found my mistake: my Notepad++ is in German. The menu and item names passed to runMenuCommand have to match the UI language, so in my case "Encoding" is "Kodierung" and "Convert to UTF-8 without BOM" is "Konvertiere zu UTF-8 ohne BOM".

That helped me out!
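
To check on disk whether a batch run actually changed anything, independent of the menu language, a small standalone check may help. This is only a sketch: the function name `describe_encoding` and its four labels are made up for illustration, and it runs in plain Python rather than inside the plugin.

```python
import codecs


def describe_encoding(path):
    """Return 'utf-8-bom', 'ascii', 'utf-8' or 'not-utf-8' for the file at path."""
    with open(path, 'rb') as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):
        return 'utf-8-bom'
    try:
        text = data.decode('utf-8')
    except UnicodeDecodeError:
        # Bytes that are not valid UTF-8, e.g. a file still in an ANSI code page
        return 'not-utf-8'
    # Pure ASCII is byte-identical in ANSI and UTF-8 without BOM
    if all(ord(ch) < 128 for ch in text):
        return 'ascii'
    return 'utf-8'
```

A result of 'ascii' means the file contains no non-ASCII characters, so there is nothing on disk that distinguishes ANSI from UTF-8 without BOM, and an editor may still report such a file as ANSI after conversion.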

Phil
5

You can also record and play back a macro. That's what worked for me, since my plugin manager is somehow broken and I don't have Python available.

  • drag a set of files (or all - I think there is a limit in the maximum number of files) into notepad++
  • Macro -> Start recording
  • do the conversion
  • save file
  • close file
  • Macro -> Stop recording

You can play back the macro by selecting

  • Macro -> Run a Macro Multiple Times
  • Enter a value such that all files are processed

Since the files are closed after processing, you will know which files have not been processed yet.

klaus
  • This is it! Much simpler than fiddling with some python script. – Armin Bu Jul 26 '22 at 16:33
  • I don't think this is working; it's not changing the file. If I record just the Encoding->UTF-8 step and stop the recording, it doesn't offer me a playback option – user433342 Aug 02 '23 at 20:08
0

Use the Notepad++ Python Script plugin. Copy this code into a new script:

# -*- coding: utf-8 -*-
from __future__ import print_function

from Npp import notepad
import os

# Note: this prepends a UTF-8 BOM to each file; it does not re-encode the contents.
utf8_bom = b'\xEF\xBB\xBF'
top_level_dir = notepad.prompt('Paste path to top-level folder to process:', '', '')
if top_level_dir:
    if not os.path.isdir(top_level_dir):
        print('bad input for top-level folder')
    else:
        for (root, dirs, files) in os.walk(top_level_dir):
            for fn in files:
                full_path = os.path.join(root, fn)
                print(full_path)
                with open(full_path, 'rb') as f:
                    data = f.read()
                if len(data) > 0:
                    # Compare all three BOM bytes, not just the first one
                    if not data.startswith(utf8_bom):
                        try:
                            with open(full_path, 'wb') as f:
                                f.write(utf8_bom + data)
                            print('added BOM:', full_path)
                        except IOError:
                            print("can't change - probably read-only?:", full_path)
                    else:
                        print('already has BOM:', full_path)

A second solution is to use a regex find and replace:

Find in Files:
SEARCH: \A
REPLACE BY: \x{FEFF}
FILTERS: *.html
(confirm the first prompt with OK; don't cancel)
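
The script above prepends a BOM rather than removing one. For the direction the question asks about (UTF-8 without BOM), a standalone sketch outside Notepad++ might look like the following. The function name `to_utf8_without_bom` is hypothetical, and the code assumes that any file which is not valid UTF-8 is in the Windows-1252 ("ANSI") code page; adjust `fallback_encoding` if that does not hold for your files.

```python
import codecs


def to_utf8_without_bom(path, fallback_encoding='cp1252'):
    """Rewrite the file at path as UTF-8 without BOM. Return True if the file changed."""
    with open(path, 'rb') as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):
        # Strip the three BOM bytes and keep the rest unchanged
        new_data = data[len(codecs.BOM_UTF8):]
    else:
        try:
            data.decode('utf-8')  # already valid UTF-8 (or plain ASCII)
            new_data = data
        except UnicodeDecodeError:
            # Assumption: files that are not valid UTF-8 use the fallback code page
            new_data = data.decode(fallback_encoding).encode('utf-8')
    if new_data == data:
        return False
    with open(path, 'wb') as f:
        f.write(new_data)
    return True
```

Bytes that are undefined in the fallback code page raise a UnicodeDecodeError instead of being guessed at, so a wrongly assumed source encoding fails loudly in those cases. Working on a copy of the folder first is advisable, since the files are overwritten in place.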

Just Me