3

I'm trying to write an interpreter for a scripting language, and one problem I've run into is how to turn a two-character escape sequence (a backslash followed by another character) into the actual character it represents.

For example, characters that cannot be typed as a single char: '\n' for a newline, '\'' for ', and so on.

What my interpreter receives is a string such as "\\n", because it reads the text the user types in the editor one char at a time in a loop, so the "\" is read before the "n".

user2864740
David von Tamar
  • Do you have any code to share? I'm not sure what you're trying to accomplish. – Jeremy West Apr 15 '14 at 00:39
  • In the original question, you had wanted to interpret `'\n'` rather than `"\\n"`, can you not simply replace it with the newline character code whenever you read the backslash? – Couchy Apr 15 '14 at 00:46
  • Yep, sorry, I was a bit confused myself between "\\n" and @"\n" and forgot to put the at-sign @ before the string in the title. Many thanks to the editors. – David von Tamar Apr 15 '14 at 00:54

2 Answers

4

As I understand it, you have "\\n" in your string. The easiest way to handle this is to replace "\\n" with "\n" before processing it.

string replaced = original.Replace("\\n", "\n");

If you want to replace any escaped char you can use Regex.Unescape.

Beware that Regex.Unescape will try to unescape everything, so if you want to unescape only "\\." sequences, first use a regex to match them (something like "\\\\[a-zA-Z0-9]"), then iterate through the matches and replace each one with its unescaped version.
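The match-then-unescape idea can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the class name SelectiveUnescape and the method Apply are made up for the example, and the character class is deliberately narrower than [a-zA-Z0-9], on the assumption that only the common single-letter escapes are wanted. As far as I know, Regex.Unescape throws an ArgumentException for escapes it does not recognize, so a wider class would need extra care.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for this example only.
static class SelectiveUnescape
{
    // A backslash followed by one character from a small whitelist.
    // The backslash itself is included so that "\\n" in the input is read as
    // an escaped backslash followed by a plain 'n', not as a newline.
    private static readonly Regex SimpleEscape = new Regex(@"\\[\\abfnrtv]");

    public static string Apply(string input)
    {
        // Only the matched two-character sequences are passed to Regex.Unescape;
        // anything else (for example "\*" or "\.") is left untouched.
        return SimpleEscape.Replace(input, m => Regex.Unescape(m.Value));
    }
}
```

Calling SelectiveUnescape.Apply(@"hello\nworld") should give a string with a real newline between the two words, while a sequence such as \* stays as it was typed.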

Gusman
  • @D.Diamond [C# string literals](http://msdn.microsoft.com/en-us/library/ms228362.aspx) only accept a small number of `\escapes`. – user2864740 Apr 15 '14 at 00:46
  • Updated with solution for any escaped chars – Gusman Apr 15 '14 at 00:48
  • try `\\\n` instead of `\\n`? or even `\\\\n` whichever works. – SSpoke Apr 15 '14 at 00:50
  • Okay, I will check the Regex.Unescape thing. I will update soon and mark this as the answer if the problem is solved. – David von Tamar Apr 15 '14 at 00:51
  • Beware that unescape will try to unescape everything, so if you want to unescape only \\ chars first use a regex to match them, then iterate through the results and replace with the unescaped version. – Gusman Apr 15 '14 at 00:52
  • This worked, thanks. No, it's not only \\, it's everything that starts with \, so it does exactly what I need. That's a solid solution. – David von Tamar Apr 15 '14 at 01:03
  • What I meant is it will escape \, *, +, ?, |, {, [, (, ), ^, $, ., #, so a regex like "\\\\[a-zA-Z]" should be used to match only '\\' escaped chars – Gusman Apr 15 '14 at 01:07
  • Regex.Escape is dubious in this case. The reason for that is as follows: `Regex.Escape(@"hello*world")` results in the string value of `hello\*world`; however, `"hello\*world"` is an invalid literal. So while it does "escape" the characters, it does so inappropriately for the given context. – user2864740 Apr 15 '14 at 01:14
  • As I said on the comments, updated to reflect it, but hey, it does what the user wants. – Gusman Apr 15 '14 at 01:16
-2

There is no standard method to turn an escape sequence such as \n in a string into a single character '\n', as happens during the parsing of string literals. However, it's not terribly hard to make a simple replacement function.

For example, consider the following skeleton (it doesn't handle \U, \u or \x, but that can be expanded):

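using System;
using System.Text.RegularExpressions;

// Turns simple two-character escape sequences in src into the characters they name.
string EscapeLikeALiteral (string src) {
    // Matches a backslash followed by one of: ' " \ 0 a b f n r t v
    return Regex.Replace(src, @"\\(?<simple>['""\\0abfnrtv])", (m) => {
       var s = m.Groups["simple"].Value;
       switch (s) {
           case "'": return "'";
           case "\"": return "\"";
           case "\\": return "\\"; // escaped backslash: matched by the pattern, so it needs a case
           case "0": return "\0";
           case "a": return "\a";
           case "b": return "\b";
           case "f": return "\f";
           case "n": return "\n";
           case "r": return "\r";
           case "t": return "\t";
           case "v": return "\v";
           default:
               // Unreachable as long as the character class and the cases stay in sync
               throw new InvalidOperationException();
       }
    });
}

var r = EscapeLikeALiteral(@"hello\nworld");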
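As a rough idea of how the skeleton could be expanded, here is a sketch for the \uXXXX form only. It is an assumption-laden illustration: the class UnicodeEscapes and the method Expand are hypothetical names, it accepts exactly four hex digits, and it runs as a separate pass, so it does not know about escaped backslashes (an input such as \\u0041 would still have its \u0041 part replaced). A fuller version would probably fold this alternative into the same regex as the simple escapes.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, named for this example only.
static class UnicodeEscapes
{
    // Replaces each \uXXXX (exactly four hex digits) with the UTF-16 code unit it names.
    public static string Expand(string src)
    {
        return Regex.Replace(src, @"\\u(?<hex>[0-9A-Fa-f]{4})", m =>
        {
            // Parse the four hex digits into a number, then convert it to a char.
            int code = int.Parse(m.Groups["hex"].Value,
                                 NumberStyles.HexNumber,
                                 CultureInfo.InvariantCulture);
            return ((char)code).ToString();
        });
    }
}
```

For instance, UnicodeEscapes.Expand(@"\u0041BC") should produce the string ABC.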
user2864740
  • There it is, Regex.Unescape – Gusman Apr 15 '14 at 01:05
  • @Gusman No, Regex.Unescape is *not* the same - [Regex.Unescape](http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.unescape(v=vs.110).aspx) is the opposite of Regex.Escape, but is not equivalent to this code or operation as indicated by the method name and remainder of the answer. In particular, Regex.Escape will "reverse" some mappings *not* valid in literals. – user2864740 Apr 15 '14 at 01:07
  • @D.Diamond Just read the documentation first. If you can live with those rules, then OK. – user2864740 Apr 15 '14 at 01:08
  • From MSDN: "Escapes a minimal set of characters (\, *, +, ?, |, {, [, (,), ^, $,., #, and white space) by replacing them with their escape codes.", so it escapes anything like \n, \t... so if unescape does exactly the opposite, then there it is – Gusman Apr 15 '14 at 01:10
  • @Gusman `Regex.Escape(@"hello*world")` results in the string value of `hello\*world`; however, `"hello\*world"` is an *invalid* literal. So while it does "escape" the characters, it does so *inappropriately* for the given context. – user2864740 Apr 15 '14 at 01:11
  • Yes, it's correct. ¿so?, It escapes more things as I stated in the comments, but it escapes/unescapes anything with '\.' or "\\." what he is asking for. – Gusman Apr 15 '14 at 01:12
  • @Gusman The statement "There it is, Regex.Unescape" is simply incorrect given the context. I've provided a working minimal skeleton to escape strings as though they were C# string literals (minus the \U, \u and \x forms). – user2864740 Apr 15 '14 at 01:14
  • Lol, rage downvote XD, accept that you were wrong kiddo – Gusman Apr 15 '14 at 01:14
  • @Gusman The downvote is merely because of an invalid argument for a dubious suggestion, which I have once again outlined. – user2864740 Apr 15 '14 at 01:15
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/50653/discussion-between-gusman-and-user2864740) – Gusman Apr 15 '14 at 01:17