
I'm not even sure how to ask this properly, but given my limited understanding of what standard input really is, and the fact that:

There Ain’t No Such Thing As Plain Text. If you have a string, in memory, in a file, or in an email message, you have to know what encoding it is in or you cannot interpret it or display it to users correctly

Then I think this question makes sense: what is the default encoding of bash?

In the context of a Node.js app listening for inputs from the user:

const through = require('through2') // assumption: `through` is the through2 package

process.stdin
    .pipe(through(function (chunk, _, next) {
       console.log(chunk.toString()) // chunk is an instance of Buffer
       next()
    }))

When I launch this with node app.js and press the character a, it gets decoded correctly ("standard output" displays a). According to the docs, that means it is being decoded as a utf-8 encoded character, because utf-8 is the default encoding of the toString() method.
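To make the byte-level picture concrete, here is a minimal sketch, assuming nothing but plain Node.js. It takes the terminal out of the equation and builds the buffers by hand, so it only shows how the same bytes read under different encodings, not which encoding my terminal actually sends.

```javascript
// Sketch only: inspect the raw bytes behind a character, independent of any terminal.
const asciiBytes = Buffer.from('a', 'ascii')
const utf8Bytes = Buffer.from('a', 'utf8')

console.log(asciiBytes) // <Buffer 61>
console.log(utf8Bytes)  // <Buffer 61>, the same single byte

// A non-ASCII character takes more than one byte in utf8.
const eAcute = Buffer.from('é', 'utf8')
console.log(eAcute)                    // <Buffer c3 a9>
console.log(eAcute.toString('utf8'))   // é
console.log(eAcute.toString('latin1')) // Ã© (same bytes, decoded with the wrong encoding)
```

So for a the byte alone cannot tell ascii and utf8 apart; the difference only shows up with characters outside the ascii range.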

Since utf8 is backward compatible with ascii, when I press the character a, the byte could be either ascii or utf8 (or any other ascii-compatible encoding for that matter, though I'm assuming that's unlikely). So this boils down to:

  • Which program is responsible for encoding it (the terminal? bash?)
  • What encoding is it? ascii? utf8?

I'm so confused

Radioreve
  • I found some answers there, but not all of them. E.g. when they talk about a "utf-8 terminal"... – Radioreve Sep 04 '18 at 13:22
  • A terminal emulator is the program that runs your console (`XTerm`, `Konsole`, `iTerm`, ...). For the locale to be properly taken into account, your terminal emulator must also be able to display utf-8 characters (most modern terminal emulators can). The other point raised in the duplicate answer is about TTYs, which are rudimentary consoles (you usually access them by pressing `CTRL` + `ALT` + `F1`). To enable utf-8 support on Linux machines, they need to run the program `unicode_start`. – Aserre Sep 04 '18 at 13:50
  • As far as the OS is concerned, files (and pipes) are just streams of raw bytes; it's up to the programs producing and consuming those streams of bytes to decide how to interpret them. The Locale system (see linked question) can give programs hints about what encoding to use, but ultimately it's up to you to make sure the producing and consuming program agree on the data format. – Gordon Davisson Sep 04 '18 at 14:38
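As a rough illustration of the locale hint mentioned in the comments, here is a minimal sketch for Node.js. `localeCharset` is a hypothetical helper, not a Node.js API. It assumes the usual POSIX precedence of `LC_ALL` over `LC_CTYPE` over `LANG`, and it only reads the charset suffix of the locale name, so treat the result as a hint rather than proof of what the terminal really sends.

```javascript
// Hypothetical helper (not part of Node.js): guess the charset a locale-aware
// program would assume, using the POSIX precedence LC_ALL > LC_CTYPE > LANG.
function localeCharset(env) {
  const locale = env.LC_ALL || env.LC_CTYPE || env.LANG || ''
  // Locale names look like "en_US.UTF-8" or "fr_FR.ISO-8859-1@euro".
  const match = /\.([^@]+)/.exec(locale)
  return match ? match[1] : null // null when no charset is named (e.g. "C")
}

// Prints the charset hinted at by the current environment, or null.
console.log(localeCharset(process.env))
```

On a typical modern Linux or macOS setup this probably prints something like `UTF-8`, but that depends entirely on the environment the process was started in.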

0 Answers