My problem
During load tests, my logs show System.OutOfMemoryException errors coming from a Dapper query. Based on the stack trace, the error happens while the result list is being resized.
Top of stacktrace
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Collections.Generic.List`1.set_Capacity(Int32 value)
   at System.Collections.Generic.List`1.AddWithResize(T item)
   at Dapper.SqlMapper.QueryAsync[T](IDbConnection cnn, Type effectiveType, CommandDefinition command) in C:\projects\dapper\Dapper\SqlMapper.Async.cs:line 442
The code being invoked
const string sql = $@"
SELECT [Column 1]
,[Column 2]
,[Column 3]
,[Column 4]
,[Column 5]
,[Column 6]
,[Column 7]
,[Column 8]
,[Column 9]
,[Column 10]
,[Column 11]
,[Column 12]
FROM [MyDb].[MyTable]
WHERE co = @CompanyId AND process = @Process
";
await using var connection = _dbConnectionProvider.Create(DbKey.MyDb);
var parameters = new { CompanyId = CompanyDbString(companyId), Process = process };
var command = new CommandDefinition(sql, parameters);
var results = await connection.QueryAsync<MyEntity>(command); // error happens here
return results;
Some info about the data being loaded
I checked the database: the query in question returns 141,846 rows. The columns are nothing out of the ordinary (no giant text blobs or anything like that), just reasonably sized varchars, datetimes, ints, etc. All the varchars max out at 100 characters.
And just to nip the inevitable "Do you really need to load all that data?" comment in the bud: Yes I do.
Possible solution?
So the dataset being loaded is pretty large, but nothing too crazy, and I was surprised it ran out of memory. What I find particularly interesting is that the exception stemmed from System.Collections.Generic.List`1.set_Capacity(Int32 value). Based on that, it looks like the error happened while the list was resizing its internal backing array.
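As I understand it, List<T> grows by allocating a new backing array roughly twice the size of the old one and copying everything over, so both arrays are briefly alive at the same time. A quick non-Dapper snippet just to illustrate that growth pattern (not my actual code):

using System;
using System.Collections.Generic;

// Each time Count would exceed Capacity, List<T> allocates a new internal
// array (roughly double the size) and copies the old contents into it.
var list = new List<int>();
var lastCapacity = list.Capacity;

for (var i = 0; i < 200_000; i++)
{
    list.Add(i);
    if (list.Capacity != lastCapacity)
    {
        Console.WriteLine($"Count={list.Count,7}  Capacity {lastCapacity,7} -> {list.Capacity,7}");
        lastCapacity = list.Capacity;
    }
}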
My first thought was to page the results, along the lines of what's suggested here: https://stackoverflow.com/a/52212051
However, I'm not sure that will actually help, since the rows still end up being added to a single list. Is that a valid concern? If so, would getting the count first and using it as the list's initial capacity be a good approach?
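To make the pre-sizing idea concrete, this is roughly what I'm picturing (untested sketch, continuing from the code above; it assumes Dapper's unbuffered mode, i.e. CommandFlags.None, hands back a lazily-enumerated sequence that I can drain into my own pre-sized list instead of Dapper building its own internally):

// Untested sketch: get the count first, then a single up-front allocation.
const string countSql = @"
SELECT COUNT(*)
FROM [MyDb].[MyTable]
WHERE co = @CompanyId AND process = @Process
";

var rowCount = await connection.ExecuteScalarAsync<int>(countSql, parameters);

var buffer = new List<MyEntity>(rowCount); // allocated once, never resized
var unbuffered = new CommandDefinition(sql, parameters, flags: CommandFlags.None);
foreach (var row in await connection.QueryAsync<MyEntity>(unbuffered))
{
    buffer.Add(row);
}
return buffer;

(A plain MyEntity[] sized from the count would work the same way; that's partly what I'm getting at with the next question.)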
Or is there maybe a different collection type that would be more resize- and memory-friendly than List<T>?
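For completeness, here's roughly what the paged version from the linked answer would look like for my query (untested sketch; the ORDER BY column and the page size are placeholders I made up). As far as I can tell, the rows still all accumulate in one growing list, which is why I'm not convinced paging alone helps:

// Untested sketch of the paging idea from the linked answer.
// [Column 1] as the sort key and the 10,000 page size are placeholders.
const int pageSize = 10_000;
var pagedSql = sql + @"
ORDER BY [Column 1]
OFFSET @Offset ROWS FETCH NEXT @PageSize ROWS ONLY
";

var all = new List<MyEntity>();
for (var offset = 0; ; offset += pageSize)
{
    var page = (await connection.QueryAsync<MyEntity>(
        pagedSql,
        new
        {
            CompanyId = CompanyDbString(companyId),
            Process = process,
            Offset = offset,
            PageSize = pageSize
        })).AsList();

    all.AddRange(page); // still one ever-growing list
    if (page.Count < pageSize)
    {
        break;
    }
}
return all;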