4

I was analyzing a code sample in the accepted answer to this Stack Overflow question, which contains the following block of code:

public static void SplitFile(string inputFile, int chunkSize, string path)
{
    const int BUFFER_SIZE = 20 * 1024;
    byte[] buffer = new byte[BUFFER_SIZE];

    using (Stream input = File.OpenRead(inputFile))
    {
        int index = 0;
        while (input.Position < input.Length)
        {
            using (Stream output = File.Create(path + "\\" + index))
            {
                int remaining = chunkSize, bytesRead;
                while (remaining > 0 && (bytesRead = input.Read(buffer, 0,
                        Math.Min(remaining, BUFFER_SIZE))) > 0)
                {
                    output.Write(buffer, 0, bytesRead);
                    remaining -= bytesRead;
                }
            }
            index++;
            Thread.Sleep(500); // experimental; perhaps try it
        }
    }
}

And the following line threw me for a loop:

int remaining = chunkSize, bytesRead;

It is my understanding that the comma operator, unlike most C++ operators, was deliberately left out of the C# specification; yet the code above compiles and runs just fine.

I know that you can declare multiple comma-separated variables like so:

int i, j, y;

and even initialize them:

int i = 0, j = 1, y = 2;

But the line in question appears in a while loop and contains one variable that is (hopefully) already declared and initialized, `chunkSize`, as well as one that gets set in the nested while loop below, `bytesRead`. So the use of the comma as a separator in a multiple-variable declaration doesn't really make sense to me.

In the above C# code, what are the mechanics and behavior of the comma operator/separator? Also, is there a place in the specification where these things are defined?

dbc
Griswald_911
  • 3
    You are not parsing the statement correctly in your head. You correctly note that you can have a comma-separated list of variables in a declaration, and you correctly note that you can optionally assign initial values in the declaration. You can also mix-n-match those two syntaxes! You can have a comma-separated list of declarations with *some of them* having initializers, and that's what you've got. – Eric Lippert Jul 16 '18 at 18:08
  • 2
    Now, here's a fascinating question: what does this mean: `var x = 10, y = 123.4;`. Does that mean: `int x = 10; double y = 123.4;` or does it mean `double x = 10, y = 123.4;`. See if you can figure it out, and then try it. Were you right? – Eric Lippert Jul 16 '18 at 18:09
  • 3
    This is related to a famous design problem in VB, where you could say `Dim Curly, Larry, Moe as Stooge` and of course that meant: `Curly` and `Larry` are `Variant`, `Moe` is `Stooge`, which was very confusing. – Eric Lippert Jul 16 '18 at 18:11
  • 1
    Finally: is this legal and if yes, what is the value of `y`? `int M(out int x) { x = 123; return 345; } ... int y = M(out y);` – Eric Lippert Jul 16 '18 at 18:16
  • 3
    @EricLippert I sure do miss your Fabulous Adventures in Coding! – itsme86 Jul 16 '18 at 18:18
  • 1
    @EricLippert: Thanks for the explanation! It looks like Michael Stum already answered your question. I would have thought it would parse as a double for sure. I now know that implicitly typed local variables cannot have multiple declarators. – Griswald_911 Jul 16 '18 at 18:18
  • 3
    Thanks for the kind words. I just have had no time for blogging lately. I hope to get back into it! – Eric Lippert Jul 16 '18 at 18:20
  • 1
    For what it's worth, when you are writing code and you find you have written something that would make little sense to a reader (or, even worse, surprise a reader), avoid that programming construct. Follow the "Principle of least astonishment" (or least surprise). That's something readers of @EricLippert 's blog have heard a lot about over the years. – Flydog57 Jul 16 '18 at 18:34
  • @EricLippert: With regard to your final brain teaser here -- `int M(out int x) { x = 123; return 345; } ... int y = M(out y);` So, is the moral that, since the out parameter is passed by reference and _must_ be assigned by the called method, it is simply overwritten by the secondary method assignment in the calling location? So (to illustrate) `int M(out int x) { x = 123; return 345; } ... Console.WriteLine(M(out int x)); ... Console.WriteLine(x);` results in the following output: `345` and then `123`, but `int x = M(out x);` simply results in `345`. – Griswald_911 Jul 17 '18 at 22:05
  • @Griswald_911: Correct. Another way to look at it is `int y = M(out y);` must have the same semantics as `int y; int t = M(out y); y = t;`. – Eric Lippert Jul 18 '18 at 17:31
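
The equivalence described in the last two comments can be checked with a small sketch. The method `M` is taken from the comment above; the class name `OutTeaser` and the wrappers `DirectForm` and `ExpandedForm` are hypothetical names added only for illustration.

```csharp
using System;

public static class OutTeaser
{
    // From the comment: assigns the out parameter, then returns a different value.
    public static int M(out int x)
    {
        x = 123;      // written through the reference while M runs
        return 345;   // stored into the declared variable after M returns
    }

    // The form from the brain teaser: y is in scope inside its own initializer,
    // and an out argument does not need to be assigned beforehand.
    public static int DirectForm()
    {
        int y = M(out y);
        return y;
    }

    // The expanded form given in the comment above.
    public static int ExpandedForm()
    {
        int y;
        int t = M(out y);
        y = t;
        return y;
    }

    public static void Main()
    {
        Console.WriteLine(DirectForm());   // 345
        Console.WriteLine(ExpandedForm()); // 345
    }
}
```

In both forms the value written through the `out` parameter (123) is overwritten by the assignment of the return value (345).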

2 Answers

7

This is defining two variables.

int remaining = chunkSize, bytesRead;

is equivalent to

int remaining = chunkSize;
int bytesRead; // uninitialized, gets set to bytesRead = input.Read(buffer,... below

It's a bit of a code smell to me, because multiple declarations on the same line are hard to follow and might give the impression that there is some connection between `remaining` and `bytesRead` even though there isn't.

`bytesRead` is assigned in the line below (by `input.Read`), but the compiler still needs to know somewhere that `bytesRead` is an `int`. C# allows a local variable to be declared in a method without an initializer, as long as it is definitely assigned before it is first read.
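
As a minimal sketch of that rule (not the code from the linked answer), the example below uses the same declaration shape on an in-memory array. The names `DeclaratorDemo`, `SumInChunks`, and `taken` are hypothetical and exist only for this illustration.

```csharp
using System;

public static class DeclaratorDemo
{
    // Sums at most chunkSize leading elements of values.
    public static int SumInChunks(int[] values, int chunkSize)
    {
        // One declaration statement with two declarators; only the first has an initializer.
        // Equivalent to: int remaining = chunkSize; int taken;
        int remaining = chunkSize, taken;
        int total = 0;
        int position = 0;

        // 'taken' is assigned inside the loop condition, so it is definitely
        // assigned whenever the loop body runs.
        while (remaining > 0 && (taken = Math.Min(remaining, values.Length - position)) > 0)
        {
            for (int i = 0; i < taken; i++)
            {
                total += values[position + i];
            }
            position += taken;
            remaining -= taken;
        }

        // Console.WriteLine(taken); // would not compile here: CS0165, use of unassigned local variable
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumInChunks(new[] { 1, 2, 3, 4, 5 }, 3)); // 6
    }
}
```

The commented-out line shows the other side of the rule: after the loop the compiler cannot prove that `taken` was ever assigned, so reading it there is rejected.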

Michael Stum
  • 2
    Oh man, I can't believe they allow this. This is very confusing to read, especially if you've never seen this before. – Lews Therin Jul 16 '18 at 18:08
  • 2
    @LewsTherin Yeah, though I guess this might just be a side effect of how the parser works? both `int x, y, z` (three uninitialized ints) and `int x = 0, y = 1, z = 2` are valid, so I guess that maybe no one thought about disallowing `int x = 0, y, z = 2, v, i, j = 5`. – Michael Stum Jul 16 '18 at 18:10
  • 2
    @LewsTherin Now, it IS disallowed with `var`, even if the initializers match. `var x = 10, y = 20` results in `CS0819 Implicitly-typed variables cannot have multiple declarators` – Michael Stum Jul 16 '18 at 18:11
  • 1
    @MichaelStum: Hah, I just posed that as a brain teaser for the original poster. Great minds think alike. – Eric Lippert Jul 16 '18 at 18:12
  • 2
    @EricLippert To be honest, I didn't know what that line did either, which is why I fired up LINQPad and tested it. At the risk of sounding like a salesman, buying a license has been worth it a million times over, because any time I wonder "What would this do? Would it even work?" I can just quickly fire up LINQPad and have my answer in 30 seconds instead of guessing. (Also, stuff like whether `Path.GetExtension` includes the dot or not, or other things that can easily be tested by running some actual code.) – Michael Stum Jul 16 '18 at 18:15
  • 3
    When we were designing the feature I did a poll of readers and we also polled MVPs, Microsoft insiders, and so on, as to what the "obvious" semantics of `var x = 10, y = 123.4;` should be, and it was about 50-50 which was the "obvious" meaning. So we made it illegal. A "convenience" feature that always fools half of the professionals using it is a bad feature! https://blogs.msdn.microsoft.com/ericlippert/2006/06/27/what-are-the-semantics-of-multiple-implicitly-typed-declarations-part-two/ – Eric Lippert Jul 16 '18 at 18:19
  • 1
    @EricLippert agreed, I'm happy that it's an error. var is meant to increase readability of code by avoiding stating the type twice (and allowing anonymous types to work), while it at the same time can obscure the type when used to get the return value of a method. So multiple declarations would make a feature that is meant to enhance readability into a feature that obscures it. – Michael Stum Jul 16 '18 at 18:22
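
The `var` behavior discussed in these comments can be reproduced with a short sketch. The rejected declaration is left commented out so the rest compiles; the class name `VarDeclarators` and the method `Describe` are hypothetical names for this illustration.

```csharp
using System;

public static class VarDeclarators
{
    public static string Describe()
    {
        // var x = 10, y = 123.4;   // does not compile: CS0819,
        //                          // implicitly-typed variables cannot have multiple declarators

        var x = 10;                 // inferred as int
        var y = 123.4;              // inferred as double

        // With an explicit type, multiple declarators are allowed and share that type;
        // the int literal 10 is implicitly converted to double.
        double a = 10, b = 123.4;

        return $"{x.GetType().Name} {y.GetType().Name} {a.GetType().Name} {b.GetType().Name}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe()); // Int32 Double Double Double
    }
}
```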
1

I also think that declaring more than one variable on the same line is confusing and a code smell, even though the language allows it.

To support this, the Layout Conventions section of the C# Coding Conventions states:

Write only one declaration per line.
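
Applied to the method from the question, that convention could look like the sketch below. This is an illustrative rewrite, not the code from the linked answer: the class name `FileSplitter`, the `int` return value, the argument check, the use of `Path.Combine`, and the removal of `Thread.Sleep` are all assumptions made for this example.

```csharp
using System;
using System.IO;

public static class FileSplitter
{
    // Splits inputFile into files named 0, 1, 2, ... inside outputDirectory
    // and returns the number of chunks written.
    public static int SplitFile(string inputFile, int chunkSize, string outputDirectory)
    {
        if (chunkSize <= 0)
        {
            // A non-positive chunk size would never advance the input position.
            throw new ArgumentOutOfRangeException(nameof(chunkSize));
        }

        const int BufferSize = 20 * 1024;
        byte[] buffer = new byte[BufferSize];
        int index = 0;

        using (Stream input = File.OpenRead(inputFile))
        {
            while (input.Position < input.Length)
            {
                string chunkPath = Path.Combine(outputDirectory, index.ToString());
                using (Stream output = File.Create(chunkPath))
                {
                    // One declaration per line instead of
                    // "int remaining = chunkSize, bytesRead;"
                    int remaining = chunkSize;
                    int bytesRead;

                    while (remaining > 0 &&
                           (bytesRead = input.Read(buffer, 0, Math.Min(remaining, BufferSize))) > 0)
                    {
                        output.Write(buffer, 0, bytesRead);
                        remaining -= bytesRead;
                    }
                }
                index++;
            }
        }
        return index;
    }

    public static void Main()
    {
        // Try it on a 10-byte scratch file in a temporary directory.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string source = Path.Combine(dir, "source.bin");
        File.WriteAllBytes(source, new byte[10]);

        Console.WriteLine(SplitFile(source, 4, dir)); // 10 bytes in chunks of 4 gives 3 files

        Directory.Delete(dir, true);
    }
}
```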

D. Mayen