6

What's the most memory effective way to read an SQL 2005 image field using C# 3.5?

Right now I have a (byte[])cm.ExecuteScalar("...").

If I could not read all field content into memory, that would be nice.

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
Rubens Farias
  • 57,174
  • 8
  • 131
  • 162

2 Answers2

4

See this excellent article here or this blog post for a long explanation how to do it.

Basically, you need to use a SqlDataReader and specify SequentialAccess to it when you create it - then you can read (or write) the BLOB from the database in chunks of whatever size is best for you.

Basically something like:

SqlDataReader myReader = getEmp.ExecuteReader(CommandBehavior.SequentialAccess);

while (myReader.Read())
{
   int startIndex = 0;

   // Read the bytes into outbyte[] and retain the number of bytes returned.
   retval = myReader.GetBytes(1, startIndex, outbyte, 0, bufferSize);

   // Continue reading and writing while there are bytes beyond the size of the buffer.
   while (retval == bufferSize)
   {
      // write the buffer to the output, e.g. a file
      ....

      // Reposition the start index to the end of the last buffer and fill the buffer.
      startIndex += bufferSize;
      retval = myReader.GetBytes(1, startIndex, outbyte, 0, bufferSize);
   }

   // write the last buffer to the output, e.g. a file
   ....
}

// Close the reader and the connection.
myReader.Close();

Marc

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • There are some important problems in that; you aren't incrementing the data-offset, and you aren't processing the final buffer if the last block isn't a complete buffer-load... – Marc Gravell Sep 28 '09 at 16:28
  • 1
    I think the data offset is being incremented (startIndex += bufferSize) - no? Yes - the last incomplete buffer is not shown here - this is an excerpt from the longer article I linked to, to illustrate the most important bits of the mechanism - it's **not** a complete working piece of code – marc_s Sep 28 '09 at 17:19
  • OK; "my bad" on the offset ;-p – Marc Gravell Sep 28 '09 at 20:18
  • Note to the unwary: Doing this WITHOUT setting CommandBehavior.SequentialAccess on the reader can lead to OutOfMemoryException, eg if extracting data in 1K increments from a 2MB file, totalling 2000 calls to GetBytes(). Looking at the implementation of GetBytes, it looks like it reads the entire Blob value into a an array every time before copying the requested bytes out to your array. Extracting a 2MB value in 1k chunks, your mamory rapidly fills up with temporary 2MB arrays! (2000 of them, unless the GC cleans them up in time) - hence the error. – Tao May 27 '11 at 13:06
2

The trick here is to use ExecuteReader in sequential mode, and read the data from the IDataReader. Here's a version for CLOBs - BLOBs are virtually identical, but with a byte[] and GetBytes(...).

Something like:

using (var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
{
    byte[] buffer = new byte[8040]; // or some multiple (sql server page size)
    while (reader.Read()) // each row
    {
        long dataOffset = 0, read;
        while ((read = reader.GetBytes(
            colIndex, dataOffset, buffer, 0, buffer.Length)) > 0)
        {
            // TODO: process "read"-many bytes from "buffer"
            dataOffset += read;
        }
    }
}
Community
  • 1
  • 1
Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
  • Again, in case someone looks at this answer and missed my comment above: you MUST set CommandBehavior.SequentialAccess. If you don't it might LOOK like it's working for a while, but you'll be littering your process with temporary blob-sized byte arrays, and will eventually get OutOfMemoryException on larger blob extractions! – Tao May 27 '11 at 13:08
  • @Tao - yet in both answers we remembered to add that ;p – Marc Gravell May 27 '11 at 13:27
  • Yep, I just found some code in our app where someone had missed that - and it caused a big mess. I'm not criticizing the answers, just adding some more info useful to someone in my situation. This was one of my top search results, and the only one where I can add this hopefully-useful information for future reference. The reason I think it's important to put it out there is that without the "SequentialAccess" it LOOKS like it's working... And I guess the developer responsible was satisfied that it was implemented correctly. – Tao May 27 '11 at 14:37