There is a concept of incidental white space.
Compile-time processing
A text block is a constant expression of type String, just like a string literal. However, unlike a string literal, the content of a text block is processed by the Java compiler in three distinct steps:
Line terminators in the content are translated to LF (\u000A). The purpose of this translation is to follow the principle of least surprise when moving Java source code across platforms.
Incidental white space surrounding the content, introduced to match the indentation of Java source code, is removed.
Escape sequences in the content are interpreted. Performing interpretation as the final step means developers can write escape sequences such as \n without them being modified or deleted by earlier steps.
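The ordering of these steps can be mimicked at run time. The following is only a sketch, assuming Java 15 or later: the class name TextBlockSteps and its methods are made up for illustration, and the compiler does not literally call these String methods. It shows that interpreting escapes last keeps a written \n from affecting indentation stripping.

```java
// Sketch only: imitates the three compile-time steps with runtime String methods.
// TextBlockSteps, process and escapesFirst are hypothetical names, not part of any API.
final class TextBlockSteps {

    // Step 1: translate CRLF and lone CR line terminators to LF.
    static String normalizeLineTerminators(String content) {
        return content.replace("\r\n", "\n").replace('\r', '\n');
    }

    // Steps 1 to 3 in the documented order.
    static String process(String rawContent) {
        return normalizeLineTerminators(rawContent)
                .stripIndent()         // step 2: remove incidental white space
                .translateEscapes();   // step 3: interpret escape sequences
    }

    // A deliberately wrong order, for comparison: escapes before stripping.
    static String escapesFirst(String rawContent) {
        return normalizeLineTerminators(rawContent)
                .translateEscapes()
                .stripIndent();
    }

    public static void main(String[] args) {
        // Raw content as it might appear between the delimiters: Windows line
        // terminators, four spaces of indentation, and a written-out \n escape.
        String raw = "    first\\nsecond\r\n    third\r\n    ";

        // Documented order: indentation is stripped, then \n becomes a line feed.
        System.out.println(process(raw).equals("first\nsecond\nthird\n"));

        // Wrong order: the early line feed creates an unindented line, so the
        // minimum indentation drops to 0 and the leading spaces survive.
        System.out.println(escapesFirst(raw).equals("    first\nsecond\n    third\n"));
    }
}
```

Both comparisons should print true, which is the point of the quoted remark about escape sequences not being "modified or deleted by earlier steps".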
...
Incidental white space
Here is the HTML example using dots to visualize the spaces that the
developer added for indentation:
String html = """
..............<html>
..............    <body>
..............        <p>Hello, world</p>
..............    </body>
..............</html>
..............""";
Since the opening delimiter is generally positioned to appear on the same line as the statement or
expression which consumes the text block, there is no real
significance to the fact that 14 visualized spaces start each line.
Including those spaces in the content would mean the text block
denotes a string different from the one denoted by the concatenated
string literals. This would hurt migration, and be a recurring source
of surprise: it is overwhelmingly likely that the developer does not
want those spaces in the string. Also, the closing delimiter is
generally positioned to align with the content, which further suggests
that the 14 visualized spaces are insignificant.
...
Accordingly, an appropriate interpretation for the content of a text block is to differentiate incidental white space at the start and end of each line, from essential white space. The Java compiler processes the content by removing incidental white space to yield what the developer intended.
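To see the effect on the HTML example, here is a small sketch (Java 15 or later assumed). It builds the raw content by hand, with 14 real spaces where the dots are, and lets String#stripIndent remove them. The class name IncidentalWhitespaceDemo is made up for illustration.

```java
// Sketch only: the raw content of the HTML text block, with the 14 incidental
// spaces written explicitly, passed through String#stripIndent.
final class IncidentalWhitespaceDemo {

    static String rawHtmlContent() {
        String incidental = " ".repeat(14);   // the dots in the example above
        return incidental + "<html>\n"
             + incidental + "    <body>\n"
             + incidental + "        <p>Hello, world</p>\n"
             + incidental + "    </body>\n"
             + incidental + "</html>\n"
             + incidental;                    // the line holding the closing delimiter
    }

    public static void main(String[] args) {
        // The 14 leading spaces are removed from every line; the 4 and 8 spaces
        // that indent <body> and <p> relative to <html> are essential and stay.
        System.out.print(rawHtmlContent().stripIndent());
    }
}
```

The relative indentation of the nested tags is kept because only the smallest common indentation counts as incidental.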
Your assumption that
Hello,
Java 13
<empty line>
equals
....Hello,
....Java 13
<empty line>
is inaccurate, since those leading spaces are essential white space and will not be removed by either the compiler or String#stripIndent.
To make this clear, let's keep representing each incidental white space character as a dot.
String hello = """
....Hello,
....Java 13
....""";
String hello2 = """
    Hello,
    Java 13
""";
Let's print them.
Hello,
Java 13
<empty line>
    Hello,
    Java 13
<empty line>
Let's call String#stripIndent on both and print the results.
Hello,
Java 13
<empty line>
    Hello,
    Java 13
<empty line>
To understand why nothing has changed, we need to look into the documentation.
Returns a string whose value is this string, with incidental white space removed from the beginning and end of every line.
Then, the minimum indentation (min) is determined as follows. For each non-blank line (as defined by isBlank()), the leading white space characters are counted. The leading white space characters on the last line are also counted even if blank. The min value is the smallest of these counts.
For each non-blank line, min leading white space characters are removed, and any trailing white space characters are removed. Blank lines are replaced with the empty string.
For both Strings, the minimum indentation is 0.
Hello,          // 0
Java 13         // 0    min(0, 0, 0) = 0
<empty line>    // 0
    Hello,      // 4
    Java 13     // 4    min(4, 4, 0) = 0
<empty line>    // 0
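The counting above can be reproduced with a short helper. This is a sketch that follows the quoted documentation, not the JDK's actual implementation. The names MinIndentDemo, leadingWhitespace and minIndentation are made up for illustration, and Java 15 or later is assumed for String#stripIndent.

```java
// Sketch only: computes the minimum indentation as the quoted Javadoc describes it.
final class MinIndentDemo {

    // Number of leading white space characters on a single line.
    static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    // Smallest leading white space count over all non-blank lines,
    // with the last line counted even if it is blank.
    static int minIndentation(String s) {
        String[] lines = s.split("\\R", -1);   // -1 keeps a trailing empty line
        int min = Integer.MAX_VALUE;
        for (int i = 0; i < lines.length; i++) {
            boolean isLast = (i == lines.length - 1);
            if (!lines[i].isBlank() || isLast) {
                min = Math.min(min, leadingWhitespace(lines[i]));
            }
        }
        return min;
    }

    public static void main(String[] args) {
        String hello  = "Hello,\nJava 13\n";
        String hello2 = "    Hello,\n    Java 13\n";

        System.out.println(minIndentation(hello));    // min(0, 0, 0)
        System.out.println(minIndentation(hello2));   // min(4, 4, 0)

        // With a minimum of 0 there is nothing to strip, so both are unchanged.
        System.out.println(hello.stripIndent().equals(hello));
        System.out.println(hello2.stripIndent().equals(hello2));

        // If the last line carried four spaces too, the minimum would be 4
        // and the indentation would be removed.
        String indentedLastLine = "    Hello,\n    Java 13\n    ";
        System.out.println(minIndentation(indentedLastLine));
        System.out.println(indentedLastLine.stripIndent().equals(hello));
    }
}
```

The last case shows why the empty final line matters: it is what pulls the minimum down to 0 for the second string.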
String#stripIndent gives developers access to a Java version of the re-indentation algorithm used by the compiler.
The re-indentation algorithm will be normative in The Java Language Specification. Developers will have access to it via String::stripIndent, a new instance method.
The string represented by a text block is not the literal sequence of characters in the content. Instead, the string represented by a text block is the result of applying the following transformations to the content, in order:
Line terminators are normalized to the ASCII LF character (...)
Incidental white space is removed, as if by execution of String::stripIndent on the characters in the content.
Escape sequences are interpreted, as in a string literal.
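Written as real text blocks, the two strings discussed earlier could look like the sketch below (Java 15 or later assumed; the class and field names are made up for illustration). The position of the closing delimiter decides how much of the indentation is incidental: in the second block it sits four columns to the left of the content, so four spaces per line remain as essential white space.

```java
// Sketch only: the compiler has already removed incidental white space,
// so calling stripIndent() again on the resulting strings changes nothing.
final class TextBlockEquivalence {

    // Content and closing delimiter share the same indentation:
    // all of it is incidental.
    static final String HELLO = """
            Hello,
            Java 13
            """;

    // Content is indented four columns further than the closing delimiter:
    // those four spaces are essential and stay in the string.
    static final String HELLO2 = """
                Hello,
                Java 13
            """;

    public static void main(String[] args) {
        System.out.println(HELLO.equals("Hello,\nJava 13\n"));
        System.out.println(HELLO2.equals("    Hello,\n    Java 13\n"));

        // Both strings end with a line terminator, so their last line is empty
        // and the minimum indentation seen by stripIndent() is 0.
        System.out.println(HELLO.stripIndent().equals(HELLO));
        System.out.println(HELLO2.stripIndent().equals(HELLO2));
    }
}
```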