6

I have a problem. I found that Emacs recently stopped saving my new files with the default coding system `utf-8-unix`. I don't know what I changed, but when I open a file, the mode line (just above the minibuffer) shows "--:---" instead of "-U:---", where the "U" indicates that the file is saved as utf-8-unix. How can I reset Emacs to save files in the proper coding system?

tonix
  • have a look: http://ergoemacs.org/emacs/emacs_encoding_decoding_faq.html – Mitesh Pathak Dec 21 '13 at 20:32
  • I've tried all those commands, but I still get the "undecided-unix" ("--:---") encoding instead of UTF-8. How can I resolve this? – tonix Dec 21 '13 at 20:49
  • I do not want to add to all my files "-*- coding: utf-8 -*-" at their first line, please help me find out what went wrong! Thank you! – tonix Dec 21 '13 at 20:53
  • You can add # -STAR- coding: utf-8 -STAR- to the top of the file; that will surely work. But also have a look at: http://stackoverflow.com/questions/1674481/how-to-configure-gnu-emacs-to-write-unix-or-dos-formatted-files-by-default – Mitesh Pathak Dec 21 '13 at 20:58
  • In the comment above, replace STAR with an asterisk. – Mitesh Pathak Dec 21 '13 at 21:03
  • I don't know why, even when I added these lines to my .emacs – tonix Dec 21 '13 at 21:06
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/43677/discussion-between-mitesh-pathak-and-user3019105) – Mitesh Pathak Dec 21 '13 at 21:10
  • @MiteshPathak: should be `; -*- coding: utf-8 -*-` (that is, use the Emacs comment mark, semicolon, at the start of the line, instead of the hash sign). – Teemu Leisti Oct 16 '14 at 09:33
  • `-*- coding: utf-8 -*-` works regardless of comments, e.g., use `;` for Lisp or Scheme, `#` for Bash or Python, `//` for C++ or C99, nothing for English or Français, etc. sorry @TeemuLeisti confused metadata and content. – Devon Nov 16 '16 at 11:11
  • @devon If you don't use the comment character `;` at the start of the initialization file, Emacs will choke on the line, and fail to do any user-specific initialization. – Teemu Leisti Nov 16 '16 at 11:43
  • @TeemuLeisti that has nothing to do with `-*-` and everything to do with Emacs Lisp. The OP did not specify Emacs Lisp; the files could be in other languages with other comment syntax. – Devon Nov 16 '16 at 14:33

3 Answers

10

Here is my setup:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ENCODING ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; C-h C RET
;; M-x describe-current-coding-system

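;; Keys are regexps matched against file names; \\' anchors at the end
;; of the name so that "\\.tex" does not also match e.g. ".texinfo".
(add-to-list 'file-coding-system-alist '("\\.tex\\'" . utf-8-unix))
(add-to-list 'file-coding-system-alist '("\\.txt\\'" . utf-8-unix))
(add-to-list 'file-coding-system-alist '("\\.el\\'" . utf-8-unix))
(add-to-list 'file-coding-system-alist '("\\.scratch\\'" . utf-8-unix))
(add-to-list 'file-coding-system-alist '("user_prefs" . utf-8-unix))

;; Note: keys in this alist are matched against program names, not file names.
(add-to-list 'process-coding-system-alist '("\\.txt" . utf-8-unix))

;; Note: keys in this alist are matched against network service names or ports.
(add-to-list 'network-coding-system-alist '("\\.txt" . utf-8-unix))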

(prefer-coding-system 'utf-8-unix)
(set-default-coding-systems 'utf-8-unix)
(set-terminal-coding-system 'utf-8-unix)
(set-keyboard-coding-system 'utf-8-unix)
(set-selection-coding-system 'utf-8-unix)
(setq-default buffer-file-coding-system 'utf-8-unix)

;; Treat clipboard input as UTF-8 string first; compound text next, etc.
(setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING))

;; mnemonic for utf-8 is "U", which is defined in the mule.el
(setq eol-mnemonic-dos ":CRLF")
(setq eol-mnemonic-mac ":CR")
(setq eol-mnemonic-undecided ":?")
(setq eol-mnemonic-unix ":LF")

(defalias 'read-buffer-file-coding-system 'lawlist-read-buffer-file-coding-system)
(defun lawlist-read-buffer-file-coding-system ()
  (let* ((bcss (find-coding-systems-region (point-min) (point-max)))
         (css-table
          (unless (equal bcss '(undecided))
            (append '("dos" "unix" "mac")
                    (delq nil (mapcar (lambda (cs)
                                        (if (memq (coding-system-base cs) bcss)
                                            (symbol-name cs)))
                                      coding-system-list)))))
         (combined-table
          (if css-table
              (completion-table-in-turn css-table coding-system-alist)
            coding-system-alist))
         (auto-cs
          (unless find-file-literally
            (save-excursion
              (save-restriction
                (widen)
                (goto-char (point-min))
                (funcall set-auto-coding-function
                         (or buffer-file-name "") (buffer-size))))))
         (preferred 'utf-8-unix)
         (default 'utf-8-unix)
         (completion-ignore-case t)
         (completion-pcm--delim-wild-regex ; Let "u8" complete to "utf-8".
          (concat completion-pcm--delim-wild-regex
                  "\\|\\([[:alpha:]]\\)[[:digit:]]"))
         (cs (completing-read
              (format "Coding system for saving file (default %s): " default)
              combined-table
              nil t nil 'coding-system-history
              (if default (symbol-name default)))))
    (unless (zerop (length cs)) (intern cs))))
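
One way to check which coding system the `file-coding-system-alist` entries above yield for a given path is `find-operation-coding-system`. This is only a sketch: the path is a made-up example, and the call reports just the alist lookup, not the other steps Emacs may apply when visiting a file (such as a `coding:` cookie on the first line).

```elisp
;; Evaluate with M-: after loading the setup above.
;; "/tmp/notes.txt" is a hypothetical path used for illustration only.
;; The result is a cons cell of the form (DECODING . ENCODING), or nil
;; when no entry in `file-coding-system-alist' matches the name.
(find-operation-coding-system 'insert-file-contents "/tmp/notes.txt")
```

If the result is not the expected coding system, an earlier or more general entry in the alist is probably matching first.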
lawlist
  • I don't know why even this didn't work, but thank you very much for your snippet, I'll try it! – tonix Dec 23 '13 at 15:30
  • If the files that you normally work on have extensions, go ahead and add them to `file-coding-system-alist` as I did in the example. I even added one that does not have an extension, namely `user_prefs`. – lawlist Dec 23 '13 at 16:55
3

For some reason, Windows started interpreting my init.el file as being encoded in something other than UTF-8, and choked on characters such as "ö" and "§". The solution was to add the line `; -*- coding: utf-8 -*-` at the start of the file.

To make very sure that UTF-8 is used in every case, I have the following lines in init.el:

;; Use UTF-8 for all character encoding.
(set-language-environment 'utf-8)
(set-default-coding-systems 'utf-8)
(set-selection-coding-system 'utf-8)
(set-locale-environment "en.UTF-8")
(prefer-coding-system 'utf-8)
(setq utf-translate-cjk-mode nil) ; disable CJK coding/encoding
Teemu Leisti
2

To get back the described old behavior, try adding

(set-language-environment "UTF-8")

to your .emacs startup file.
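
To see whether the setting took effect, the relevant state can be inspected directly. The following is a minimal sketch rather than a definitive diagnostic; the keyword labels are made up for readability, while the variables and functions themselves are standard Emacs ones.

```elisp
;; Evaluate with M-: (or C-j in *scratch*) to see what Emacs picked up.
(list :language-environment current-language-environment
      ;; Default coding system for newly created buffers.
      :default-file-coding (default-value 'buffer-file-coding-system)
      ;; Coding system of the buffer this is evaluated in.
      :this-buffer-coding buffer-file-coding-system
      ;; Highest-priority coding system used for detection.
      :preferred (coding-system-priority-list t)
      ;; Locale variables inherited from the environment (nil if unset).
      :lang (getenv "LANG")
      :lc-all (getenv "LC_ALL"))
```

`C-h C RET` (`describe-coding-system`) shows much of the same information interactively.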

hillu
  • I have tried that, but it did not work. I don't know what I messed up. Is there any other way? – tonix Dec 21 '13 at 20:55
  • Does starting emacs with `-Q` (`--no-init-file --no-site-file --no-splash`) change anything? – hillu Dec 22 '13 at 16:01
  • No, it does not change anything concerning the buffer's coding system. It only starts Emacs without loading the .emacs file, and I still have the trouble with the encoding. What should I do? – tonix Dec 22 '13 at 16:20
  • Tell us more about the environment (OS/Distribution, Emacs version). – hillu Dec 22 '13 at 16:23
  • I use Emacs 23.4 on Ubuntu 13.04 – tonix Dec 22 '13 at 16:27
  • You should have added that to your question in the first place. When I run `emacs -Q` on Emacs 24.3.1 (Debian/unstable) and do `C-x C-b foo.txt RET`, I get a buffer with the `U:` indicator. You can always set the encoding for specific buffers using `set-buffer-file-coding-system` (`C-x C-m f`)... – hillu Dec 22 '13 at 16:38
  • Yes, I also tried that command (`C-x C-m f`), set the utf-8-unix encoding explicitly, and saved the file. But when I killed the buffer and visited the file again, the coding system I had set was gone, and the file again showed the "--:---" indicator above the minibuffer. I do not know what to do now; is there another way? – tonix Dec 22 '13 at 16:51
  • I have a list of `(setenv "LANG" "UTF-8") (setenv "LC_CTYPE" "UTF-8") (setenv "LC_NUMERIC" "UTF-8") (setenv "LC_TIME" "UTF-8") (setenv "LC_COLLATE" "UTF-8")` and wonder if this is the same. Thanks! – Emmanuel Goldstein Feb 15 '22 at 08:37
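
One possible explanation for the indicator resetting after a file is revisited, offered here as an assumption rather than a confirmed diagnosis, is that the file contains only ASCII characters. In that case the bytes on disk are identical in UTF-8 and in many other encodings, so Emacs has nothing to detect when it reads the file back. A minimal sketch for checking this (the function name `my/buffer-ascii-only-p` is made up for illustration):

```elisp
(defun my/buffer-ascii-only-p ()
  "Return non-nil if the current buffer contains only ASCII characters."
  (save-excursion
    (save-restriction
      (widen)
      (goto-char (point-min))
      ;; Search for the first character outside the ASCII range.
      (not (re-search-forward "[^[:ascii:]]" nil t)))))

;; Related built-in check: per the Emacs Lisp manual, this returns the
;; list (undecided) when the text contains no multibyte characters.
(find-coding-systems-region (point-min) (point-max))
```

If the check returns non-nil for the affected files, adding a single non-ASCII character and saving is a quick way to test whether the `U` indicator then survives a revisit.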