35

I looked at the Haskell 2010 report and noticed a weird escape sequence with an ampersand: \&. I couldn't find an explanation of what this escape sequence stands for. It also seems to be allowed only in strings. I tried print "\&" in GHCi, and it prints an empty string.

Peter Mortensen
Nolan
  • 1
    It’s an empty escape for breaking up other meaningful sequences. I don’t remember exactly what it lets you write – `"\012\&3"` or something? – Ry- Jul 09 '19 at 22:49
  • 7
    The explanation you seek is [in section 2.6, Character and String Literals](https://www.haskell.org/onlinereport/haskell2010/haskellch2.html#x7-200002.6). – Daniel Wagner Jul 09 '19 at 23:11

1 Answer

43

It escapes... no character at all. It is useful for "breaking" an escape sequence that would otherwise absorb the characters that follow it. For instance, we might want to express "\12" ++ "3" as a single string literal. If we try the obvious approach, we get

"\123" ==> "{"

We can however use

"\12\&3"

for the intended result.

Also, "\SOH" and "\SO" are both valid escapes for a single ASCII character, and the longer one wins, making "\SO" ++ "H" tricky to express as a single literal: we need "\SO\&H" for that.
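The two cases above can be checked directly. This is only a small sketch: the name breakExamples is made up for illustration, and each pair compares a literal written with \& against the explicit concatenation it is meant to stand for.

```haskell
-- Hypothetical name: pairs of (literal using \&, explicit concatenation).
breakExamples :: [(String, String)]
breakExamples =
  [ ("\12\&3", "\12" ++ "3")  -- character 12 followed by the digit '3'
  , ("\SO\&H", "\SO" ++ "H")  -- Shift Out followed by 'H', not Start of Heading
  , ("\&", "")                -- \& on its own contributes no character
  ]

main :: IO ()
main = do
  -- Every literal equals its concatenated counterpart.
  print (all (uncurry (==)) breakExamples)                 -- True
  -- Without \& the escape swallows the next character.
  print (map length ["\123", "\12\&3", "\SOH", "\SO\&H"])  -- [1,2,1,2]
```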

This escape trick is also exploited by the standard Show String instance, which has to produce a valid literal syntax. We can see this in action in GHCi:

> "\140" ++ "0"
"\140\&0"
> "\SO" ++ "H"
"\SO\&H"
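The same behaviour can be observed outside GHCi. The sketch below assumes GHC's standard Show and Read instances for String; the names shown and roundTrips are hypothetical helpers, not part of any library.

```haskell
-- Hypothetical helper: what show produces for a few tricky strings.
shown :: [String]
shown = map show ["\140" ++ "0", "\SO" ++ "H", "\140" ++ "A"]

-- Hypothetical helper: show must produce a literal that reads back unchanged.
roundTrips :: String -> Bool
roundTrips s = read (show s) == s

main :: IO ()
main = do
  mapM_ putStrLn shown
  -- "\140\&0"
  -- "\SO\&H"
  -- "\140A"   (no \& needed: 'A' cannot extend a numeric escape)
  print (all roundTrips ["\140" ++ "0", "\SO" ++ "H", "\SOH"])  -- True
```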

Further, this greatly helps external programs that generate Haskell code (e.g. for metaprogramming). When emitting characters for a string literal, the external program can add \& at the end of potentially ambiguous escapes (or even of all escapes) so that it does not have to handle unwanted interactions with whatever comes next. E.g. if the program wants to emit \12 now, it can emit \12\& and be free to emit anything as the next character. Otherwise, the program would have to remember that the next character must be preceded by \& if it is a digit. It's simpler to always add \&, even when it's not needed: \12\&A is legal, and has the same meaning as \12A.
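A minimal sketch of such a generator, written in Haskell for convenience although the idea applies to any emitting language. The names emitChar and emitLiteral are hypothetical, and the policy of appending \& after every numeric escape is just the "always add it" strategy described above, not the only possible one.

```haskell
import Data.Char (ord)

-- Hypothetical helper: render one character for use inside a string literal.
-- Printable ASCII is emitted as-is (with quote and backslash escaped);
-- everything else becomes a decimal escape followed by \&, so the next
-- emitted character can never be absorbed into the escape.
emitChar :: Char -> String
emitChar '"'  = "\\\""
emitChar '\\' = "\\\\"
emitChar c
  | c >= ' ' && c <= '~' = [c]
  | otherwise            = "\\" ++ show (ord c) ++ "\\&"

-- Hypothetical helper: wrap the rendered characters in double quotes.
emitLiteral :: String -> String
emitLiteral s = "\"" ++ concatMap emitChar s ++ "\""

main :: IO ()
main = do
  putStrLn (emitLiteral ("\12" ++ "3"))        -- "\12\&3"
  -- The generated literal reads back as the original string.
  let sample = "tab\there \"quoted\" \\ \1234" ++ "5"
  print (read (emitLiteral sample) == sample)  -- True
```

The emitter keeps no state between characters, which is the point: it never needs to look at the next character to decide whether a \& is required.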

Finally, a quote from the Haskell Report, explaining \&:

2.6 Character and String Literals

[...]

Consistent with the "maximal munch" rule, numeric escape characters in strings consist of all consecutive digits and may be of arbitrary length. Similarly, the one ambiguous ASCII escape code, "\SOH", is parsed as a string of length 1. The escape character \& is provided as a "null character" to allow strings such as "\137\&9" and "\SO\&H" to be constructed (both of length two). Thus "\&" is equivalent to "" and the character '\&' is disallowed. Further equivalences of characters are defined in Section 6.1.2.

AJF
chi
  • 6
    It’s also the zero-width equivalent of a *gap*: a backslash followed by some whitespace and another backslash is stripped from the string to allow multi-line string literals, but with no intervening whitespace this gives a single backslash. I’ve found it useful to help syntax highlighters that get tripped up on gaps at the end of a string, since the literal ends with `\"` but that’s not an escaped quotation mark. – Jon Purdy Jul 09 '19 at 23:29