3

I have a tab character assigned by its ASCII (09) or Unicode code:

char ch = '\x09';
(or)
char ch = '\u0009';

How do I print '\t' in the console window?

None of the below works (it may print a tab, but not as a canonical representation of '\t').

Console.Write(ch);
Console.Write(ch.ToString());

I guess Console.Write() is not the right way to do it.

Antony Thomas
  • Do you want a representation for all controls, or just for U+0009? – Jon Hanna Sep 06 '12 at 23:28
  • All control characters. I can easily get it in Python or, for that matter, even in the mono interpreter or the Roslyn plugin for VS. But I just can't get it in the Console Window. – Antony Thomas Sep 07 '12 at 04:04

4 Answers

7

Control characters don't display, since that is the whole point of a control character. The reason for putting a tab in a piece of text is to have a tab, after all.

Ideally, we could use the standard symbols ␀␁␂␃␄␅␆␇␈␉␊␋␌␍␎␏␐␑␒␓␔␕␖␗␘␙␚␛␜␝␞␟␠␡␤␥␦ but they don't have great font support (right now, as I see it, the symbols for Delete Form Two and Substitute Form Two aren't showing correctly) and this is even worse with the console.

You're also not clear in your question whether you want the canonical representation (U+0009) or the C# escape (\t) as you ask for one right after asking for the other ("a canonical representation of '\t'").

Assuming the latter, we have an issue in that C# only provides such short-cut escapes for 8 of the control characters. The process will also require us to escape \ for the same reason that C# does: how otherwise could we tell whether \t means tab, or means \ followed by t?

So, assuming you want a form that you could then use directly in C# again, we can do the following:

using System.Text; // for StringBuilder

public static class StringEscaper
{
    public static string EscapeForCSharp(this string str)
    {
        StringBuilder sb = new StringBuilder();
        foreach(char c in str)
            switch(c)
            {
                case '\'': case '"': case '\\':
                    sb.Append(c.EscapeForCSharp());
                    break;
                default:
                    if(char.IsControl(c))
                        sb.Append(c.EscapeForCSharp());
                    else
                        sb.Append(c);
                    break;
            }
        return sb.ToString();
    }
    public static string EscapeForCSharp(this char chr)
    {
        switch(chr)
        {//first catch the special cases with C# shortcut escapes.
            case '\'':
                return @"\'";
            case '"':
                return "\\\"";
            case '\\':
                return @"\\";
            case '\0':
                return @"\0";
            case '\a':
                return @"\a";
            case '\b':
                return @"\b";
            case '\f':
                return @"\f";
            case '\n':
                return @"\n";
            case '\r':
                return @"\r";
            case '\t':
                return @"\t";
            case '\v':
                return @"\v";
            default:
                //we need to escape surrogates when they're single chars,
                //but in strings we can just use the character they produce.
                if(char.IsControl(chr) || char.IsHighSurrogate(chr) || char.IsLowSurrogate(chr))
                    return @"\u" + ((int)chr).ToString("X4");
                else
                    return new string(chr, 1);
        }
    }
}

Now we can test it with both a string and a single char.

Single char:

Console.WriteLine('\t'.EscapeForCSharp());

Outputs:

\t

String:

string str = "The following string contains all the \"C0\" and \"C1\" controls, escaped with \\ as per C# syntax: "
  + "\u0000\u0001\u0002\u0003\u0004\u0005\u0006\u0007\u0008\u0009\u000A\u000B\u000C\u000D\u000E\u000F\u0010\u0011\u0012\u0013\u0014\u0015\u0016\u0017\u0018\u0019\u001A\u001B\u001C\u001D\u001E\u001F\u007F\u0080\u0081\u0082\u0083\u0084\u0085\u0086\u0087\u0088\u0089\u008A\u008B\u008C\u008D\u008E\u008F\u0090\u0091\u0092\u0093\u0094\u0095\u0096\u0097\u0098\u0099\u009A\u009B\u009C\u009D\u009E\u009F";
Console.WriteLine(str.EscapeForCSharp());

Outputs:

The following string contains all the \"C0\" and \"C1\" controls, escaped with \\ as per C# syntax: \0\u0001\u0002\u0003\u0004\u0005\u0006\a\b\t\n\v\f\r\u000E\u000F\u0010\u0011\u0012\u0013\u0014\u0015\u0016\u0017\u0018\u0019\u001A\u001B\u001C\u001D\u001E\u001F\u007F\u0080\u0081\u0082\u0083\u0084\u0085\u0086\u0087\u0088\u0089\u008A\u008B\u008C\u008D\u008E\u008F\u0090\u0091\u0092\u0093\u0094\u0095\u0096\u0097\u0098\u0099\u009A\u009B\u009C\u009D\u009E\u009F
Jon Hanna
  • I thought the canonical form is the standardized `'\t'` per this[1], but sorry if I got the verbiage wrong. Nevertheless, I understand your point that non-printable characters cannot be printed. The reason I started this post is this simple experiment (driven by reasons out of scope): in the Python interpreter, I type `a='\x09'` and `str(a)` or just `a` and it shows me `'\t'`. Then in the mono interpreter I type `char ch = '\x09'; ch;` and it shows me `'\t'` as well. But I just can't get it in the C# console. Anyway, thank you for the elaborate answer. – Antony Thomas Sep 07 '12 at 14:47
  • [1] Link about canonical http://stackoverflow.com/questions/1167371/what-does-canonical-representation-mean-and-its-potential-vulnerability-to-websi – Antony Thomas Sep 07 '12 at 14:49
  • 1
    In the mono console you get a real tab too. The interpreters are to help you debug and try out things so `\t` is useful, but the consoles are for real work where `\t` would be useless most of the time (how then would you actually output a tab?). "Canonical" means "a way to restrict a choice of equally valid representations of the same data to a single choice-free format". The `U+0009` I give above is the canonical form of the Unicode language-agnostic way to represent a character independent of its glyph or lack thereof (so it works just as well for tab as for A, B or C)... – Jon Hanna Sep 07 '12 at 15:06
  • ... there's no canonical C# escape, because C# was designed to be written by people rather than produced and interchanged as source files, so there's no need to say `\t` is more or less canonical than `\u0009` or even than ` ` straight in the source. – Jon Hanna Sep 07 '12 at 15:08
  • Thanks for the perspective. I was stupid to vie for the .000001% usage of Console. – Antony Thomas Sep 07 '12 at 16:21
  • Not stupid, just focused on your own problems, which is mostly a good thing (would you like a programmer who didn't focus on the problems in front of them?) but it can make behaviour seem strange when it's actually reasonable. – Jon Hanna Sep 07 '12 at 16:55
2

There is no built-in way to print characters as C# escape sequences.

Just use a dictionary to map characters to strings as needed:

var map = new Dictionary<char, string>{ {'\t', @"\t"} };

And use it to replace characters (if present in map) during output.
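
As a rough sketch of that replacement step: the class name `EscapeMap`, the method name `Escape`, and the extra map entries below are assumptions made for illustration, not part of the answer. Characters that are not in the map are passed through unchanged.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, named here only for illustration.
public static class EscapeMap
{
    // Map from a character to the text that should be shown for it.
    // Only a few entries are listed; extend as needed.
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { '\t', @"\t" },
        { '\n', @"\n" },
        { '\r', @"\r" },
        { '\\', @"\\" },
    };

    // Returns a copy of text in which every mapped character is replaced
    // by its escape; all other characters are copied as they are.
    public static string Escape(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            string replacement;
            if (Map.TryGetValue(c, out replacement))
                sb.Append(replacement);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this in place, `Console.Write(EscapeMap.Escape(ch.ToString()))` would write the two characters `\t` for the tab character from the question, rather than the tab itself.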

Alexei Levenkov
  • This is just so interesting. So I fire up the mono interpreter and type `char ch = '\x09'; ch;` and, boom, I get the output as '\t'. Not sure why this is so complicated in the Console window. – Antony Thomas Sep 07 '12 at 04:05
  • It's more complicated because the Console window is designed for use with actual output, not just for debugging. 99.99999% of the time somebody outputs a tab to the console window, it's because they wanted a tab, not an escape for a tab. – Jon Hanna Sep 07 '12 at 09:57
  • 2
    @AntonyThomas, VS has similar behavior (which you can easily spot by the huge number of "how to remove escape sequences from my string" questions) - the immediate window and debug tooltips/variable views show you the encoded string, but actual console output does not (because you really don't want to see text in your command prompt or files with all characters encoded). – Alexei Levenkov Sep 07 '12 at 16:07
  • Yes, there's an irony in that we very often get the exact opposite question, here. – Jon Hanna Sep 07 '12 at 17:01
0

Try this:

    string tab = "\u0009";
    Console.Write(tab.Replace(tab, "\\t"));
rumburak
0

\t has a special meaning (a tab) so if you want to print it you need to escape the '\'.

Console.WriteLine("\tHello World");
prints:     Hello World

Console.WriteLine("\\tHello World");
prints: \tHello World

You can also use the @-Syntax to remove the special meanings of \t, \n, \'...
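
A small sketch of the two spellings side by side. The class and member names (`VerbatimDemo`, `Escaped`, `Verbatim`, `AreEqual`) are made up for illustration. Note that both forms apply only to literals written in source code; they do not change how a `char` variable that already holds a tab is printed.

```csharp
// Hypothetical demo class, named here only for illustration.
public static class VerbatimDemo
{
    // Regular literal: the backslash is doubled to get a literal '\' followed by 't'.
    public const string Escaped = "\\tHello World";

    // Verbatim literal: backslashes are taken literally, so no doubling is needed.
    public const string Verbatim = @"\tHello World";

    // Both constants hold the same characters, so this returns true.
    public static bool AreEqual()
    {
        return Escaped == Verbatim;
    }
}
```

Passing either constant to `Console.WriteLine` prints `\tHello World`.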

raisyn