I was reading the popular Stack Overflow question "Creating a byte array from a stream" and wanted to get some clarification on how byte arrays work.

In this chunk of code:

    byte[] buffer = new byte[16 * 1024];
    using (MemoryStream ms = new MemoryStream())
    {
        int read;
        while ((read = PictureStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            ms.Write(buffer, 0, read);
        }

        return ms.ToArray();
    }

Here's what I'm not understanding:

I'm getting lost on the size this array is set to. For example, I use that code chunk to convert an image stream to a byte array, but I'm usually reading images that are larger than 2 megabytes, which is far larger than the array that's reading in the picture (16 * 1024 bytes). However, the above code converts the image from a stream to a byte array totally fine, with no "index out of bounds" errors.

So how can my array be smaller than the photo I'm reading in, yet still manage to read it totally fine?

izzyk
  • It's reading and writing it in chunks - it's called streaming. Read a buffer full, then write the buffer, and repeat until you've transferred all the data. – Matthew Watson Aug 14 '20 at 14:26
  • I see. I'll have to read some more about streaming. How do you figure out what size to make the byte array? Why 16 * 1024, as opposed to some other number? – izzyk Aug 14 '20 at 14:28
  • It's kind of arbitrary: one byte at a time would be really slow, but 1 GB at a time would waste a load of memory. 16K is just a reasonable buffer size. – Matthew Watson Aug 14 '20 at 14:29
  • [Best memory buffer size](https://stackoverflow.com/q/3033771/1997232). – Sinatr Aug 14 '20 at 14:29

2 Answers

The array you pass is just a buffer. When you read from the stream, Read returns the number of bytes it read and fills the buffer with that many bytes (the buffer is not always completely filled). Then you write that many bytes to the memory stream. This process repeats until there are no more bytes to read from the stream.

You will notice that the array produced by ToArray is much larger than your buffer size.
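
As a minimal sketch of this, the loop below reuses one 16 KiB buffer while the MemoryStream grows to hold everything. The names `StreamDemo`, `ReadFully` and `ReadFullyWithCopyTo` are made up for illustration, and the 2 MiB random array is only a stand-in for an image stream. The second helper uses the built-in `Stream.CopyTo`, which performs the same kind of buffered copy.

```csharp
using System;
using System.IO;

public static class StreamDemo
{
    // Copies the whole stream through one small buffer that is reused on every pass.
    public static byte[] ReadFully(Stream input, int bufferSize = 16 * 1024)
    {
        byte[] buffer = new byte[bufferSize];
        using (MemoryStream ms = new MemoryStream())
        {
            int read;
            // Read returns how many bytes it placed in the buffer; 0 means end of stream.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Only the first 'read' bytes of the buffer are valid on this pass.
                ms.Write(buffer, 0, read);
            }
            return ms.ToArray();
        }
    }

    // Same result, letting Stream.CopyTo do the chunked copy.
    public static byte[] ReadFullyWithCopyTo(Stream input)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            input.CopyTo(ms);
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        // Stand-in for a 2 MiB image (assumption: any byte source behaves the same way).
        byte[] source = new byte[2 * 1024 * 1024];
        new Random(42).NextBytes(source);

        byte[] viaLoop = ReadFully(new MemoryStream(source));
        byte[] viaCopyTo = ReadFullyWithCopyTo(new MemoryStream(source));

        Console.WriteLine(viaLoop.Length);   // 2097152
        Console.WriteLine(viaCopyTo.Length); // 2097152
    }
}
```

The buffer never needs to be as large as the data, because each pass overwrites it after its contents have already been written to the MemoryStream.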

Jason

As already mentioned in the comments, the Read method of PictureStream only reads one chunk of data at a time, at most as much as the transport buffer can hold. Each time we read a chunk, we write it from the transport buffer to the output stream.

I wrote a code snippet to demonstrate what is going on:

    // requires: using System; using System.IO;

    // 1024 * 1000 * 5 bytes = 5,120,000 bytes = 5000 KiB (slightly less than 5 MiB)
    int inputBufferSizeInByte = 1024 * 1000 * 5;
    byte[] inputBuffer = new byte[inputBufferSizeInByte];
    // fill the input buffer with random bytes
    Random rnd = new Random();
    rnd.NextBytes(inputBuffer);

    // define the streams
    MemoryStream inputMemoryStream = new MemoryStream(inputBuffer);
    MemoryStream outPutMemoryStream = new MemoryStream();

    // define a smaller buffer for reading
    int transportBufferSizeInByte = 1024 * 16; // 16 KiB
    byte[] transportBuffer = new byte[transportBufferSizeInByte];

    int amountTotalWeReadInByte = 0;
    int tempReadAmountInByte = 0;
    int callWriteCounter = 0;
    do
    {
        tempReadAmountInByte = inputMemoryStream.Read(transportBuffer, 0, transportBufferSizeInByte);

        // write what we got to the output
        if (tempReadAmountInByte > 0)
        {
            outPutMemoryStream.Write(transportBuffer, 0, tempReadAmountInByte);
            callWriteCounter++;
        }

        // keep a running total of the bytes read
        amountTotalWeReadInByte += tempReadAmountInByte;

    } while (tempReadAmountInByte > 0);

    // print a summary
    Console.WriteLine("input buffer size:   \t" + inputBufferSizeInByte + " \t in Byte");
    Console.WriteLine("total amount read   \t" + amountTotalWeReadInByte + " \t in Byte");
    Console.WriteLine("output stream size: \t" + outPutMemoryStream.Length + " \t in Byte");
    Console.WriteLine("called stream write \t" + callWriteCounter + "\t\t times");

output:

    input buffer size:      5120000      in Byte
    total amount read       5120000      in Byte
    output stream size:     5120000      in Byte
    called stream write     313      times

So the stream's Write method is called 313 times, and everything behaves as it should.
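
The figure 313 can be checked with a little arithmetic: 5,120,000 bytes divided by 16,384 bytes per chunk is 312.5, so there are 312 full chunks plus one final half-filled chunk. The sketch below uses a hypothetical helper, `ChunkCount`, and assumes every read fills the buffer until the data runs out, as a MemoryStream does. Other streams, such as network streams, may return fewer bytes per Read, in which case the number of writes could be higher.

```csharp
using System;

public static class ChunkMath
{
    // Number of chunks needed when every chunk except possibly the last one is full.
    // This is integer ceiling division: ceil(totalBytes / bufferSize).
    public static int ChunkCount(long totalBytes, int bufferSize)
    {
        return (int)((totalBytes + bufferSize - 1) / bufferSize);
    }

    public static void Main()
    {
        long total = 1024L * 1000 * 5; // 5,120,000 bytes, as in the snippet above
        int buffer = 1024 * 16;        // 16,384 bytes

        Console.WriteLine(ChunkCount(total, buffer)); // 313
        Console.WriteLine(total % buffer);            // 8192 (size of the last chunk)
    }
}
```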

That brings me to the key question:

Why is there a size difference between the picture in memory and on the hard disk?

I think the picture encoding is the reason.

The difference between a picture's size on disk and the size of its in-memory representation is often due to the picture encoding. I know this from working with the C++ library OpenCV, and I would guess the C# implementation behaves similarly.
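
To make the encoding point concrete, here is a rough sketch of how large a decoded picture would be, assuming an uncompressed layout with a fixed number of bytes per pixel. The dimensions and the `DecodedSizeInBytes` helper are hypothetical and not taken from the question. Keep in mind that the loop in the question copies the stream's bytes unchanged, so this estimate only applies once a library actually decodes the image.

```csharp
using System;

public static class ImageSizeEstimate
{
    // Bytes needed for an uncompressed bitmap: one fixed-size entry per pixel.
    public static long DecodedSizeInBytes(int width, int height, int bytesPerPixel)
    {
        return (long)width * height * bytesPerPixel;
    }

    public static void Main()
    {
        // Hypothetical 4000 x 3000 photo decoded to 24-bit RGB (3 bytes per pixel).
        long decoded = DecodedSizeInBytes(4000, 3000, 3);
        Console.WriteLine(decoded); // 36000000
    }
}
```

A compressed file of such a photo (JPEG, for example) is typically only a few megabytes on disk, which is why the decoded size can look surprisingly large by comparison.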

See this Q&A on the topic: "JPEG image memory byte size from OpenCV imread doesn't seem right".

t2solve
  • Where is the file? I copied the code as is, but I am not sure where to reference the uploaded file. – Jashvita May 31 '23 at 07:55