An example:

var a = $"Some value 1: {b1:0.00}\nSome value 2: {b2}\nSome value 3: {b3:0.00000}\nSome value 4: {b4:0.00}\nSome value 5: {b6:0.0}\nSome value 7: {b7:0.000000000}";

That source is somewhat hard to read.

I can split it like this:

var a = $"Some value 1: {b1:0.00}\n" +
        $"Some value 2: {b2}\n" +
        $"Some value 3: {b3:0.00000}\n" +
        $"Some value 4: {b4:0.00}\n" +
        $"Some value 5: {b6:0.0}\n" +
        $"Some value 7: {b7:0.000000000}";

But there is a comment saying that this will be multiple calls to string.Format, and I think it will (I have no idea how to check; IL is still a black box to me).

Question: is it OK to do this? What are the other options for splitting a long interpolated string?

Sinatr
  • Since you have newlines in your string, why not use what's in [the other answer](http://stackoverflow.com/a/31766560/791010) on the question you linked to, but with the newlines outside of the braces? – James Thorpe Mar 01 '16 at 13:05
  • _"is it ok to do?"_ - do you care about nanoseconds, or readability? – CodeCaster Mar 01 '16 at 13:05
  • @JamesThorpe, it's ugly. – Sinatr Mar 01 '16 at 13:05
  • Look at using a [`StringBuilder`](https://msdn.microsoft.com/en-GB/library/system.text.stringbuilder(v=vs.110).aspx) – ChrisF Mar 01 '16 at 13:06
  • @ChrisF, I like string interpolation feature, can you demonstrate what you mean please? – Sinatr Mar 01 '16 at 13:06
  • My point is that this is opinion-based. What you find ugly (your words), another can find aesthetically pleasing. If there is a format that is the most readable to you while not significantly harming performance, why should you care, and how could we? – CodeCaster Mar 01 '16 at 13:07
  • @CodeCaster, simple: long string in source, hard to read. What would you do at my place? Shall I make an example line 10 times longer to make problem more clear? – Sinatr Mar 01 '16 at 13:08
  • I don't think my opinion matters, but I will slap the person who wrote the string in the first place. – CodeCaster Mar 01 '16 at 13:09
  • @FᴀʀʜᴀɴAɴᴀᴍ because you don't want strings like that in source code anyway, and that code should not have passed peer review? There are tons of ways for building strings, and this is about the ugliest way there is. I would take back a step and wonder why this code is there in the first place, but again, opinion-based, so not really answerable. – CodeCaster Mar 01 '16 at 13:12
  • @CodeCaster, good point. How will you define a slim edge after that normal interpolated string become an *ugliest way* please? – Sinatr Mar 01 '16 at 13:14
  • @Sinatr I'm not going to expand on my view on coding style, and especially strings in code, in a Stack Overflow comment if you don't mind. And I did not vote to close as duplicate, let that be clear. – CodeCaster Mar 01 '16 at 13:17
  • @CodeCaster, try to answer, perhaps you will solve my X problem. I hate old `string.Format` with confusing `{0}` thingies, but that was easy to split in multiple lines. String interpolation is ace (readability), but can't be split without side effect. I don't know how to construct resulting string better (in my imagination `StringBuilder` draw something ugly and monstrous). – Sinatr Mar 01 '16 at 13:24
  • @Sinatr: CodeCaster is talking about these (at least three) issues: 1. you have a translatable resource which should not be hardcoded, 2. concatenating a string with newlines is almost certainly unnecessary (what are you doing with this, printing to a console?), 3. formatting a string with 7 different variables means you should most likely have 7 key/value pairs, or an array of 7 values, or some similar way to avoid code duplication. – vgru Mar 01 '16 at 13:57
  • @GvS, Fᴀʀʜᴀɴ: not sure why this question was reopened, but I am still pretty sure it's either a duplicate (*What are other options to split long interpolated string?* - that's *verbatim interpolated string* precisely, like Eric Lippert wrote below), or opinion based (*Is it ok to do?* - that would be "ok to do", in what objective sense?). – vgru Mar 01 '16 at 14:01
  • On my system, the second variation runs about 25% slower. Mind you, a call to my test function takes less than 0.004ms. At that point, things like function overhead start being visible in the test. I don't think efficiency is a concern. Keep in mind that `String.Format` performance is more influenced by string length than function overhead; 7 calls to string.format on a string isn't much different than 1 call to a string that is 7x as long. – Brian Mar 01 '16 at 14:22
  • @Brian mind you that the speed of string concatenation also depends on the length of the string and that `string.Format` has to overallocate. More calls is usually slower with strings. The .NET runtime just does a great job here. That said, it is well known that bad use of strings can lead to performance issues, which is why f.ex. `StringBuilder` originated in the first place. – atlaste Mar 02 '16 at 07:55
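
The `StringBuilder` suggestion from the comments was never demonstrated in the thread. A minimal sketch of what it might look like while still using interpolation is shown here; the class name `BuilderDemo`, the method `Build`, and the reduced set of three values are assumptions made for illustration, and `FormattableString.Invariant` is used only so the decimal separator does not depend on the machine's culture.

```csharp
using System;
using System.Text;

public static class BuilderDemo
{
    // Hypothetical helper: builds the report one short interpolated line at a time.
    public static string Build(double b1, int b2, double b3)
    {
        var sb = new StringBuilder();
        // Each Append receives an already formatted string, so every source line stays short.
        sb.Append(FormattableString.Invariant($"Some value 1: {b1:0.00}\n"));
        sb.Append(FormattableString.Invariant($"Some value 2: {b2}\n"));
        sb.Append(FormattableString.Invariant($"Some value 3: {b3:0.00000}"));
        return sb.ToString();
    }
}
```

This still formats each line separately, so it addresses readability rather than the number of formatting calls.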

2 Answers

*"this will be multiple calls to string.Format and I think it will"*

You're right. You haven't said why you care. Why is that to be avoided?

*"is it ok to do?"*

It's fine by me.

*"What are other options to split long interpolated string?"*

I would use a verbatim interpolated string. That will solve your problem nicely.

See the question "How do you use verbatim strings with interpolation?"

(Since that is the link you mentioned in the question I am not 100% clear on why you asked this question, since you already read a page that suggested a good answer.)
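
For readers who have not seen the syntax, here is a minimal sketch of a verbatim interpolated string. The class name `VerbatimDemo`, the method `Build`, and the cut-down set of three values are assumptions for illustration; `FormattableString.Invariant` is used only to keep the decimal separator independent of the current culture.

```csharp
using System;

public static class VerbatimDemo
{
    // Hypothetical helper: $@ lets the literal span several source lines.
    public static string Build(double b1, int b2, double b3)
    {
        // Continuation lines start at column 0 on purpose:
        // any leading whitespace would become part of the string.
        return FormattableString.Invariant($@"Some value 1: {b1:0.00}
Some value 2: {b2}
Some value 3: {b3:0.00000}");
    }
}
```

Two caveats follow from how verbatim literals work: the line breaks in the result are whatever line endings the source file uses, and indentation inside the literal is significant, so an automatic reformat that indents the continuation lines changes the output.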

*"I don't like $@ idea, it makes it worse than long string"*

You might have said that earlier.

*"can't it be accidentally damaged by reformatting sources?"*

All code can be changed by changing the sources.

*"What are other options to split long interpolated string?"*

Don't interpolate in the first place. Make the string a resource, make a class responsible for fetching formatted resource strings, and hide the implementation details of how you format the string inside methods of the class.
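
As a rough sketch of that idea, assuming a hypothetical class `ReportText` and using a private constant as a stand-in for a real resource lookup (for example a .resx entry), callers would see only a method and never the format string:

```csharp
using System.Globalization;

public static class ReportText
{
    // Stand-in for a resource string; in a real project this would be
    // fetched from a resource file rather than hardcoded here.
    private const string SummaryTemplate =
        "Some value 1: {0:0.00}\n" +
        "Some value 2: {1}\n" +
        "Some value 3: {2:0.00000}";

    // Callers pass values; how the text is formatted stays an implementation detail.
    public static string Summary(double value1, int value2, double value3)
    {
        return string.Format(CultureInfo.InvariantCulture, SummaryTemplate,
                             value1, value2, value3);
    }
}
```

A call such as `ReportText.Summary(b1, b2, b3)` then replaces the long literal at the call site. The invariant culture is an assumption made to keep the sketch deterministic; a localized application would pass the user's culture instead.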

Eric Lippert
  • I don't like `$@` idea, it makes it worse than long string as for my taste (I could be wrong, but can't it be accidentally damaged by reformatting sources?). I think single `string.Format` call is more efficient than multiple, this is why I care in *general*. – Sinatr Mar 01 '16 at 13:18
  • @Sinatr: Concentrate on the code being correct, readable and maintainable first; worry about "efficiency" when you have an empirically demonstrated performance problem that is fixed by changing your correct, readable code. – Eric Lippert Mar 01 '16 at 13:22
  • *"All code can be changed by changing the sources"* - you are right, what I mean is *code formatting* (e.g. when moving from one class to another). I never control how formatting is done, using `$@` makes me obliged to do so. – Sinatr Mar 01 '16 at 13:28
  • I don't like idea with resources. Something in your words sounds like iteration over source of data (and this makes me understand why people say `StringBuilder`). If I would have to output all properties or all dictionary entries, then I would not have issues. But in my example `b1`, `b2`, etc. are simple local variable. Well, I can put their values into something (e.g. `List<>`) first.... What I would like is to have compiler to be smart enough and combine multiple `$"..." +` into single `string.Format`. Then there will be no question. – Sinatr Mar 01 '16 at 13:37
  • @EricLippert I agree with your answer, but I've found it puzzling that strings get concatenated during compile time, while interpolated strings do not (see my own answer). I've been thinking about a reason for this behavior (like Exception behavior), but don't see any reason why it behaves like this. Can you think of any good reason (except for "is not implemented (yet)")? – atlaste Mar 02 '16 at 07:58
  • @atlaste: Like all features, it's a matter of assigning effort to solving problems that produce large return on that effort. There's no theoretical objection to improving the code generation for combined interpolated strings. It's just likely not a big win; that effort could be spent on better features. – Eric Lippert Mar 02 '16 at 15:57

What does the compiler do?

Let's start here:

var a = $"Some value 1: {b1:0.00}\n" +
        $"Some value 2: {b2}\n" +
        $"Some value 3: {b3:0.00000}\n" +
        $"Some value 4: {b4:0.00}\n" +
        $"Some value 5: {b6:0.0}\n" +
        $"Some value 7: {b7:0.000000000}";

*"IL is a black box for me yet"*

Why not simply open it up? That's pretty easy with a tool such as ILSpy or Reflector.

What will happen in your code is that each line is compiled to a string.Format. The rule is pretty simple: if you have $"...{X}...{Y}..." it will be compiled as string.Format("...{0}...{1}...", X, Y). Also the + operator will introduce a string concatenation.
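
The rule above can be written out by hand. The sketch below assumes a hypothetical class `LoweringDemo` with two methods: one uses interpolation, the other spells out the `string.Format` calls that the rule describes. It reflects the compiler behaviour described in this answer; newer compiler versions may generate different code for the same source.

```csharp
public static class LoweringDemo
{
    // Two interpolated pieces joined with +, as in the question.
    public static string Interpolated(double x, int y) =>
        $"A: {x:0.00}\n" + $"B: {y}";

    // Hand-written equivalent under the rule above:
    // one string.Format call per piece, then a concatenation.
    public static string Lowered(double x, int y) =>
        string.Format("A: {0:0.00}\n", x) + string.Format("B: {0}", y);
}
```

Both methods return the same text for the same arguments, which is the observable part of the equivalence; the actual IL can be inspected with one of the tools mentioned above.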

In more detail, string.Format is a simple static call, which means that the compiler will use the call opcode instead of callvirt.

From all this you might deduce that it's pretty easy for a compiler to optimize this: if you have an expression like constant string + constant string + ..., you can simply replace it with a single constant string. You can argue that the compiler has knowledge about the inner workings of string.Format and string concatenation and could handle that. On the other hand, you could argue that it should not. Let me detail the two considerations:

Note that strings are objects in .NET, but they are 'special ones'. You can see this from the fact that there's a special ldstr opcode, but also if you check out what happens if you switch on a string -- the compiler will generate a dictionary. So, from this you could deduce that the compiler 'knows' how a string works internally. Let's figure out if it knows how to do concatenation, ok?

var str = "foo" + "bar";
Console.WriteLine(str);

In IL (Release mode of course) this will give:

L_0000: ldstr "foobar"
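
The same folding can be observed from C# without reading IL, because a `const` declaration only compiles when its initializer is a compile-time constant. This is a small sketch; the class name `ConstDemo` and the field `Joined` are made up for illustration.

```csharp
public static class ConstDemo
{
    // Compiles only because "foo" + "bar" is evaluated by the compiler
    // and treated as the single constant "foobar".
    public const string Joined = "foo" + "bar";
}
```

With the compiler described in this answer, an interpolated string such as `$"foo{1}"` is not accepted in the same position, which is another way of seeing that interpolated strings are not folded like plain literals.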

tl;dr: Regardless of whether compile-time concatenation of interpolated strings is already implemented (it is not), I'm fairly confident that the compiler will handle this case eventually.

What does the JIT do?

Next question would be: how smart is the JIT compiler with strings?

So, let's consider for a moment that we teach the compiler about all the inner workings of string. First, note that C# is compiled to IL, which is then JIT-compiled to native code. In the case of the switch it's pretty hard for the JIT compiler to create the dictionary, so that has to be done in the C# compiler. On the other hand, for more complex concatenation it makes sense to reuse the machinery that already exists for, say, integer arithmetic and apply it to string operations as well. That implies putting string operations in the JIT compiler. Let's consider that for a moment with an example:

var str = "";
for (int i=0; i<10; ++i) {
    str += "foo";
}
Console.WriteLine(str);

The compiler will simply compile the concatenation to IL, which means that the IL will hold a pretty straight-forward implementation of this. In this case loop unrolling arguably has a lot of benefits for the (runtime) performance of the program: it can simply unroll the loop, appending the string 10 times, which results in a simple constant.

However, giving this knowledge to the JIT compiler makes it more complex, which means that the runtime will spend more time on JIT compiling (figuring out the optimization) and less time executing (running the emitted code). The question that remains is: what will happen?

Put a breakpoint on the Console.WriteLine call, start the program, and press Ctrl+Alt+D to view the disassembly.

00007FFCC8044413  jmp         00007FFCC804443F  
            {
                str += "foo";
00007FFCC8044415  mov         rdx,2BEE2093610h  
00007FFCC804441F  mov         rdx,qword ptr [rdx]  
00007FFCC8044422  mov         rcx,qword ptr [rbp-18h]  
00007FFCC8044426  call        00007FFD26434CC0  

[...]
00007FFCC804443A  inc         eax  
00007FFCC804443C  mov         dword ptr [rbp-0Ch],eax  
00007FFCC804443F  mov         ecx,dword ptr [rbp-0Ch]  
00007FFCC8044442  cmp         ecx,0Ah  
00007FFCC8044445  jl          00007FFCC8044415  

tl;dr: Nope, that's not optimized.

But I want the JIT to optimize that as well!

Yea, well, I'm not too sure if I share that opinion. There's a balance between runtime performance and time spent in JIT compilation. Notice that if you're doing something like this in a tight loop, I would argue that you're asking for trouble. On the other hand, if it's a common and trivial case (like the constants that are concatenated) it's pretty easy to optimize and it doesn't affect the runtime.

In other words: arguably, you don't want this to be optimized by the JIT, assuming that would take too much time. I'm confident we can trust Microsoft in making this decision wisely.

Also, you should realize that strings in .NET are heavily optimized things. We all know that they're used a lot, and so does Microsoft. If you're not writing 'really stupid code', it's a very reasonable assumption that it will perform just fine (until proven otherwise).

Alternatives?

*"What are other options to split long interpolated string?"*

Use resources. Resources are a useful tool in dealing with multiple languages. And if this is just a small, non-professional project - I simply wouldn't bother at all.

Alternatively, you can use the fact that constant strings are concatenated at compile time:

// Format items are zero-based: {0} is b1, {1} is b2, and so on.
var fmt = "Some value 1: {0:0.00}\n" +
          "Some value 2: {1}\n" +
          "Some value 3: {2:0.00000}\n" +
          "Some value 4: {3:0.00}\n" +
          "Some value 5: {5:0.0}\n" +
          "Some value 7: {6:0.000000000}";

var a = string.Format(fmt, b1, b2, b3, b4, b5, b6, b7);
atlaste
  • Thanks for nice explanation. While I don't like `{0}` anymore it seems what using `string.Format` is the right way in this case. – Sinatr Mar 02 '16 at 08:38