17

I've got some UTF-8 files created on a Mac, and when I try to open them using TextPad on Windows, I get the following warning:

WARNING: (file name) contains characters that do not exist in code page 1252 (ANSI Latin 1). They will be converted to the system default character, if you click OK.

Linux (GNOME gEdit) can open the same file without complaints. What does the above mean? I thought that TextPad had full UTF-8 support. Can I safely open and edit UTF-8 files using it without corrupting the file?

PaulJ
  • Always had the same problem, too. TextPad is awesome, but it sucks when it comes to character encoding. What I do to circumvent the problem is to put all the icon definition lines of my css files into a separate css file. I then edit this file with Notepad. – reggie Jun 07 '13 at 08:35
  • 4
    TextPad 8 is here with BMP Unicode support (see http://stackoverflow.com/a/35076216/8946) – Lawrence Dol Jan 29 '16 at 20:57

7 Answers

12

It seems that TextPad cannot handle characters outside windows-1252 (CP1252, here carrying the misnomer “ANSI Latin 1”). I tested it on Windows, opening a plain text file created on the same system, as UTF-8 encoded, both with and without BOM, with the same result. The program’s help does not seem to contain anything related to character encodings, and its tools for writing “international characters” are for Latin-1 characters only.
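For anyone who wants to repeat the "with and without BOM" test, here is a minimal Python sketch for checking whether a byte string starts with a UTF-8 BOM and decoding it either way. The helper names `has_utf8_bom` and `decode_utf8` are made up for illustration and have nothing to do with TextPad itself.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

def decode_utf8(data: bytes) -> str:
    """Decode UTF-8 bytes; the 'utf-8-sig' codec drops a leading BOM if one is present."""
    return data.decode("utf-8-sig")

with_bom = codecs.BOM_UTF8 + "Ω".encode("utf-8")
without_bom = "Ω".encode("utf-8")
print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))
print(decode_utf8(with_bom) == decode_utf8(without_bom))
```

Both variants decode to the same text, so an editor that truly works in Unicode should treat them identically.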

There are several text editors for Windows that can deal with UTF-8 (even Notepad can open a UTF-8 file, but it can hardly be recommended for serious editing). See Alan Wood’s collection of information on Unicode editors and word processors for Windows. (Personally, I like Notepad++ and BabelPad, which are both free.)

Jukka K. Korpela
  • The specification for TextPad specifically says this: "16-bit Unicode, UTF-8 and 8-bit text files with single and double byte characters can be edited." – David Heffernan Jan 16 '12 at 12:24
  • 3
    TextPad Help says something confused about encodings, but setting Encoding to UTF-8 in the Open dialog does not help. Neither does it help to set, in Configure/Preferences, the default encoding to UTF-8: the data is still flattened to windows-1252 (i.e., characters outside it are mapped to windows-1252 characters or question marks or something else). They say “This means that it is only possible to edit, without data loss, files containing characters from the implied code page.” (TextPad Help, keyword “unicode”) – Jukka K. Korpela Jan 16 '12 at 13:44
  • 1
    Very odd. Can't understand paying for a product like that when Notepad++ exists! – David Heffernan Jan 16 '12 at 13:48
  • 1
    @DavidHeffernan: The TextPad specification "forgot" to mention the limitations pointed out by Jukka K. Korpela / bobince. "_Unicode, UTF-8 & 8-bit text files with single and double byte characters can be edited._" — what they really mean: "**A small subset of** …" As a paying customer, I complained about the lack of real UTF-8 support (despite **claims** of support) years ago, when they were back on major versions 4 and 5; and tried all possible configurations using all available advice! TextPad staff responded to other support enquiries, but on this issue they have been suspiciously silent! – Matthew Slyman Oct 14 '15 at 12:09
  • 1
    It is sad. I have been a big fan, and got two different organizations to license TextPad for the entire team. However, my new assignment requires Unicode, and a single-byte editor no longer works. Eclipse edits Unicode fine. The claim to edit "16-bit Unicode, UTF-8... " is, in my opinion, bordering on fraud: it will read files with those encodings, but only to translate them for an internal 8-bit editor. You cannot edit Japanese characters, not with any amount of fiddling. TextPad has been a great tool all these years, but it is time to move on... – AgilePro Jan 04 '16 at 03:25
9

TextPad 8, the newest as of 2016-01-28, does finally properly support BMP Unicode. It's a paid upgrade, but so far has been working flawlessly for me.

Lawrence Dol
  • 1
    Yes, but... even with the txt class configured to always convert to UTF-8, the file command reporting UTF-8, Notepad++ opening the file correctly, and a Unicode font selected for the txt class, it turns the ä's and ö's into broken characters by assuming ANSI. If one inserts characters outside the ANSI code page, such as Cyrillic, then it correctly assumes UTF-8. To me this is a bug, not a feature, and yes, Notepad++ handles it correctly. – Yordan Georgiev Jun 22 '17 at 06:41
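The broken ä and ö described in the comment above are what you get when UTF-8 bytes are decoded as windows-1252. A small Python sketch of the effect follows; it only illustrates the encoding mismatch and says nothing about how TextPad is implemented internally.

```python
text = "äö"

# UTF-8 stores each of these characters as two bytes.
utf8_bytes = text.encode("utf-8")

# Reading those bytes as windows-1252 turns every byte into its own character.
misread = utf8_bytes.decode("cp1252")
print(misread)

# As long as nothing was altered, the damage can be reversed.
repaired = misread.encode("cp1252").decode("utf-8")
print(repaired == text)
```

The round trip only works while the misread text is left untouched; once an editor replaces or saves the wrong characters, the original bytes may be gone.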
6

TextPad ‘supports’ UTF-8 and UTF-16 documents only inasmuch as it will import and export them. It still edits files as simple bytes in the ANSI code page (code page 1252 for Western European systems), not as Unicode characters.

So unless the file happened to contain only characters that also exist in that code page, you will lose content. This rather defeats the point of Unicode.
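To check in advance whether a file would survive such an editor, you can list the characters that code page 1252 cannot represent. This is a rough Python sketch; `chars_outside_cp1252` is a name invented here, and the sample string stands in for the contents of a real file.

```python
def chars_outside_cp1252(text):
    """Return the distinct characters of `text` that cp1252 cannot encode, in order of first appearance."""
    missing = []
    for ch in text:
        try:
            ch.encode("cp1252")
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing

sample = "café Ω 日本"
print(chars_outside_cp1252(sample))

# A lossy conversion replaces each unrepresentable character with "?".
lossy = sample.encode("cp1252", errors="replace").decode("cp1252")
print(lossy)
```

An empty result means the text fits in code page 1252 and should round-trip; anything else is what the warning in the question is about.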

Indeed, this was the issue that made me flee—to EmEditor, at the time, though now I would agree with the previous comments and recommend Notepad++. The era of paying for text editors is long gone.

bobince
  • 1
    If I can just say one more thing on stackoverflow for the rest of my life, then it would be to try emeditor. Superb. – sksamuel May 28 '12 at 12:39
5

Actually, TextPad does support displaying Unicode code points, granted they went about it the wrong way. To display the Unicode characters, choose Configure->Preferences and expand "Document Classes->Text->Font".

You need to choose a Unicode font AND set the Script to match. E.g. Arial Unicode MS with script CHINESE_BIG5.

However, this is a backward approach, since the application should handle this itself when the user tells TextPad to open the file as Unicode or UTF-8. The built-in Notepad application in MS Windows detects the encoding automatically and displays the glyphs correctly based on it.

Warren Rox
  • 2
    Even Arial Unicode MS doesn't contain all the glyphs I want; and if I, a paying customer and computer science graduate, could not by any means figure out how to make a multilingual (even just pan-European) UTF-8 source-code file with TextPad, after reading their help files, forums etc. and after attempting to contact support; then there's something seriously wrong (I shouldn't say how much time I wasted struggling with corrupted UTF-8 SQL dumps because of TP!) If the TextPad people are going to claim unicode support for their product, they should at least put an asterisk next to that claim! – Matthew Slyman Oct 14 '15 at 12:36
3

I found a discussion on this in the Textpad forums: http://forums.textpad.com/viewtopic.php?t=11019

I have Notepad++, but TextPad handles large files with ease, whereas other editors I've tried, including Notepad++, either slow to a crawl or die. I'm currently trying to edit a 475MB file and Notepad++ is not up to the task.

  • Large files: I think that's because of the text highlighting, which needs a lot more memory. Notepad++ can't fix this because it is a limitation of the Scintilla component that Notepad++ uses to display the text. – StanE May 10 '15 at 04:18
  • [EmEditor is specifically designed to handle large files gracefully](https://www.emeditor.com/text-editor-features/large-file-support/large-file-controller/). In my experience (e.g. with SQL files of 5–15 GB in size on a 64-bit Windows computer with 4GB of RAM), it does so admirably well. (Certain operations such as global find-and-replace are always going to be slow on any text editor in this situation, but EmEditor takes a practical approach to doing what's possible.) – Matthew Slyman Oct 14 '15 at 11:56
2

Textpad Configure Menu --> Preferences --> Document Classes --> Default --> Default encoding --> UTF-8

-2

Try the ANSI code set with File/Open; that should solve the problem in TextPad.

Rinus