
I am trying to return a rather large dataset (several thousand rows) from a SQL Server 2016 database as JSON (using its FOR JSON AUTO feature); however, I am running into performance issues reading the results from the SqlDataReader.

I was previously loading the results into a DataTable as regular rows, and that works quite well (it takes approximately 10-15 seconds to load the entire table). However, building the JSON string from the same data via the DataReader takes several minutes. I should add that I am also returning a substantial amount of binary data in this query (SqlGeometry objects I am retrieving from the database). The string returned is typically several hundred thousand characters long, and reading it from the reader is horrendously slow.

Anyway, my code for reading the JSON is as follows:

public static Task<string> ExecuteJsonReaderAsync(string connectionString, CommandType commandType, string commandText, SqlParameter[] oParams = null, int timeout = 30, CancellationToken token = default(CancellationToken))
{
    return Task<string>.Factory.StartNew(() =>
    {
        var str = string.Empty;
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var command = new SqlCommand(commandText, connection))
            {
                command.CommandTimeout = timeout;
                command.CommandType = commandType;

                if (oParams?.Length > 0) command.Parameters.AddRange(oParams);

                using (var reader = command.ExecuteReader(CommandBehavior.CloseConnection))
                {
                    while (reader.Read())
                    {
                        str = $"{str}{reader[0]}";
                    }
                    reader.Close();
                }

                return str;
            }
        }
    }, token);
}

I have tried various command options to speed it up (CloseConnection, SequentialAccess, SingleResult), but to no avail. Why is building the string taking so much longer than loading a DataTable from the same data, and is there any faster way to accomplish this?

I figure there has to be something I am doing wrong or overlooked, and I'm hoping that someone has run across this issue before. Any ideas?

  • Use a StringBuilder instance – Steve Mar 14 '17 at 23:37
  • I think your problem might be `str = $"{str}{reader[0]}";`. Try using a `StringBuilder` and appending to it in the while loop. Otherwise a new string will be created each time (strings are immutable). – PeteGO Mar 14 '17 at 23:37
  • Just a remark not related to your main issue, but don't use `Task.Factory.StartNew` when you can use async/await all the way. Consider using `await ExecuteReaderAsync`. – Cyril Durand Mar 14 '17 at 23:44
  • Cyril, the reason it was set up that way is because this code is used elsewhere to return a task that can be awaited. This was just the piece that was doing the work. – Nathan Raley Mar 15 '17 at 01:21
  • So is the string building what is causing it to take so long to iterate through the while (reader.Read()) loop? I wouldn't have thought using the immutable string would have caused it to take that long to process that loop. It would be nice if there was a quicker way to dump all the results into a string variable. By default, SQL returns the JSON in multiple rows if it exceeds certain lengths. – Nathan Raley Mar 15 '17 at 01:23

2 Answers


Your code reallocates the string in memory on every loop iteration, which is detrimental to performance. The StringBuilder class instead keeps an internal buffer that requires far fewer reallocations, and you can also control the size of this buffer to avoid reallocation entirely if you know the total length of your data.

So

// Set an initial capacity of 1MB
StringBuilder str = new StringBuilder(1024 * 1024);
while (reader.Read())
{
    str.Append(reader[0].ToString());
}

// ...
return str.ToString();
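To see why this matters, here is a minimal standalone sketch (not the question's query, just a synthetic demonstration) that reproduces the quadratic cost of repeated concatenation versus a single pre-sized StringBuilder:

```csharp
using System;
using System.Diagnostics;
using System.Text;

class ConcatVsBuilder
{
    static void Main()
    {
        const int iterations = 5_000;
        // Stand-in for one FOR JSON result row (~100 characters each).
        string chunk = new string('x', 100);

        var sw = Stopwatch.StartNew();
        string s = string.Empty;
        for (int i = 0; i < iterations; i++)
        {
            s = s + chunk; // copies the ENTIRE string so far on every pass: O(n^2) overall
        }
        sw.Stop();
        Console.WriteLine($"string concatenation: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        var sb = new StringBuilder(iterations * chunk.Length); // pre-sized buffer, no reallocation
        for (int i = 0; i < iterations; i++)
        {
            sb.Append(chunk); // writes into the buffer: O(n) overall
        }
        string result = sb.ToString();
        sw.Stop();
        Console.WriteLine($"StringBuilder:        {sw.ElapsedMilliseconds} ms");
    }
}
```

With several thousand rows of JSON output the concatenation version does gigabytes of redundant copying, which is exactly the slowdown described in the question.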

More about string immutability in C# here
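As a quick illustration of that immutability (a standalone snippet, not part of the answer's code):

```csharp
using System;

class Immutability
{
    static void Main()
    {
        string a = "foo";
        string b = a;
        a += "bar"; // allocates a brand-new string; b still references the old one

        Console.WriteLine(b);                            // prints "foo"
        Console.WriteLine(object.ReferenceEquals(a, b)); // prints "False"
    }
}
```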

Steve
  • That's a nice point that I wasn't aware of, and thanks for bringing that up. However, the length of the data varies, so I could utilize some of that, but wouldn't be able to allocate the total size. Would I need to look into the serialization option to read out the data into sqlchars or sqlstring? There has to be a quicker way to dump all that data into a single string. – Nathan Raley Mar 15 '17 at 01:06
  • You don't need to allocate the total size. The docs explain that when the appended strings exceed the total size, the class reallocates the buffer to be bigger (twice as large each time). The point is to avoid the constant reallocation of the string. Just prepare a starting buffer big enough to accommodate your first strings, then let the class do its internal resizing. – Steve Mar 15 '17 at 08:22
  • Yep, thanks a ton. That was it. The reallocation of the string over that many iterations, due to the length of the JSON returned, must have been the problem. I just wasn't able to notice it on the smaller sets of data. Thanks again! – Nathan Raley Mar 15 '17 at 15:07

There are a couple of issues here:

  1. Your use of string interpolation via $"{str}{reader[0]}" allocates a new string on every iteration, which is very heavy on the GC. You should be using a StringBuilder and its Append() method.
  2. You should really be using async all the way through.

Here is how I would write the code for better performance (both scaling-wise via async and memory-wise via StringBuilder):

public static async Task<string> ExecuteJsonReaderAsync(string connectionString, CommandType commandType, string commandText, SqlParameter[] oParams = null, int timeout = 30, CancellationToken token = default)
{
    await using var connection = new SqlConnection(connectionString);
    await connection.OpenAsync(token);

    await using var command = new SqlCommand(commandText, connection);
    command.CommandTimeout = timeout;
    command.CommandType = commandType;

    if (oParams?.Length > 0) command.Parameters.AddRange(oParams);

    var stringBuilder = new StringBuilder(1024 * 1024);
    await using var reader = await command.ExecuteReaderAsync(CommandBehavior.CloseConnection, token);
    while (await reader.ReadAsync(token))
    {
        stringBuilder.Append(reader[0]);
    }
    await reader.CloseAsync();

    return stringBuilder.ToString();
}
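For reference, a hypothetical call site for the method above (the connection string, procedure name, and timeout values are placeholders, not from the question):

```csharp
using System;
using System.Data;
using System.Threading;
using System.Threading.Tasks;

class Usage
{
    static async Task Main()
    {
        // Cancel automatically if the whole operation takes more than 2 minutes.
        using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(2));

        string json = await ExecuteJsonReaderAsync(
            "Server=.;Database=Gis;Integrated Security=true", // placeholder connection string
            CommandType.StoredProcedure,
            "dbo.GetFeaturesAsJson",                          // placeholder procedure name
            timeout: 120,
            token: cts.Token);

        Console.WriteLine(json.Length);
    }
}
```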
Dave Black