2

sI have the following use case. I have a web service for downloading csv files.

Csv files are formed on the go, i.e., some data is retrieved from the database row by row and converted in the runtime into csv, the csv then, is put into MemoryStream that is returned by the web service for downloading. Even though the generation of the csv file is pretty fast, its size can grow over 2GB, and when its size does grow over 2GB OutOfMemoryException is thrown because MemoryStream can not handle so much data.

This is a test code snippet I wrote to better illustrate the problem:

//This is my WCF WebService
public class DownloadService : IDownloadService
{
    //This is the method for download csv
    public Stream DownloadFile()
    {
        var users = GetUsers();
        using (var memoryStream = new MemoryStream())
        using (var streamWriter = new StreamWriter(memoryStream))
        {
            foreach (var user in users)
            {
                var csvRow = $"{user.Id},{user.FirstName},{user.LastName}\n";
                streamWriter.Write(csvRow);  

                //When the size exceeds 2GB exception will be thrown here
                memoryStream.Flush();
            }
            WebOperationContext.Current.OutgoingResponse.Headers.Add("Content-Disposition", $"attachment; filename=Users.csv");
            WebOperationContext.Current.OutgoingResponse.ContentType = "application/csv";
            return memoryStream;
        }
    }

    //Method that returns Users from the database
    private IEnumerable<User> GetUsers()
    {
        string cmdQuery = "select Id, FirstName, LastName from User";
        using (var connection = new SqlConnection("some connection string"))
        using (SqlCommand cmd = new SqlCommand(cmdQuery, connection))
        {
            connection.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    yield return new User
                    {
                        Id = (int)reader["Id"],
                        FirstName = reader["FirstName"].ToString(),
                        LastName = reader["LastName"].ToString()
                    };
                }
            }
        }
    }
}

public class User
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

Is there any way I can bypass this limit, or maybe use another stream?

Please note the application is 32 bit and I would like not to use FileStream because, as I said, the generation is relatively fast, and if I use FileStream I will have to manage all the infrastructure to store and retrieve files, which in my opinion will be redundant, and may slow down the whole process too.

Mykhailo Seniutovych
  • 3,527
  • 4
  • 28
  • 50
  • 2
    You could probably use the response-stream directly. – johannes.colmsee Apr 20 '18 at 05:56
  • 2
    Just write directly to response stream, no need for any intermediates, such as files or memory streams. – Evk Apr 20 '18 at 06:19
  • It is not a `MemoryStream`, it is a process. See [OutOfMemoryException while populating MemoryStream: 256MB allocation on 16GB system](https://stackoverflow.com/questions/15595061/outofmemoryexception-while-populating-memorystream-256mb-allocation-on-16gb-sys), [How to return stream from WCF service?](https://stackoverflow.com/questions/11582409/how-to-return-stream-from-wcf-service) and [Best practices for streaming response of WCF webservice](https://stackoverflow.com/questions/2403002/best-practices-for-streaming-response-of-wcf-webservice) discussions on StackOverflow – Leonid Vasilev Apr 20 '18 at 06:41
  • @MickyD Please have a look at my post, I've added some sample code. – Mykhailo Seniutovych Apr 20 '18 at 06:50
  • Streams are useful for... ahem... streaming data where bytes flow through the system without regard to the amount of data. A `MemoryStream` allows you to buffer the content of an entire stream in memory. That is normally only useful if the amount of data is small. It seems that your system is built around data streaming in from a SQL Server and streaming out through an HTTP API. You don't need the `MemoryStream` in the middle. It will slow things down and in some cases require more memory than is available. – Martin Liversage Apr 20 '18 at 08:39

0 Answers0