0

I’m developing a .NET 5, C# console application to create a CSV file from a list of custom objects, gzip it, and upload it to an Azure Storage container with this code:

var blobServiceClient = new BlobServiceClient("My connection string");
var containerClient = GetBlobContainerClient("My container name");
var config = new CsvConfiguration(CultureInfo.CurrentCulture) { Delimiter = ";", Encoding = Encoding.UTF8 };

var list = new List<FakeModel>
{
   new FakeModel { Field1 = "A", Field2 = "B" },
   new FakeModel { Field1 = "C", Field2 = "D" }
};

await using var memoryStream = new MemoryStream();
await using var streamWriter = new StreamWriter(memoryStream);
await using var csvWriter = new CsvWriter(streamWriter, config);

await csvWriter.WriteRecordsAsync(list);

await using var zip = new GZipStream(memoryStream, CompressionMode.Compress, true);
await memoryStream.CopyToAsync(zip);

memoryStream.Seek(0, SeekOrigin.Begin);
var blockBlob = containerClient.GetBlockBlobClient("test.csv.gz");
await blockBlob.UploadAsync(memoryStream);

It appears to work, but when I download the gzip from the cloud to check it, I get the following error when trying to decompress it:

Error message from 7-Zip saying the archive is invalid

Inspection of the file shows it has a length of 0.

Can you help me understand why?

Andrew Morton
  • 24,203
  • 9
  • 60
  • 84
Pine Code
  • 2,466
  • 3
  • 18
  • 43

1 Answers1

2

The stream parameter of a new GZipStream is the destination stream. To process the input to the output, you need to write to the instance of GZipStream somehow.

When I experimented with it, I found that a call to csvWriter.FlushAsync() was necessary.

Like this:

class Program
{
    public class FakeModel
    {
        public string Field1 { get; set; }
        public string Field2 { get; set; }
    }

    static async Task Main(string[] args)
    {
        var config = new CsvConfiguration(CultureInfo.CurrentCulture) { Delimiter = ";", Encoding = Encoding.UTF8 };

        var list = new List<FakeModel>
                        {
                           new FakeModel { Field1 = "A", Field2 = "B" },
                           new FakeModel { Field1 = "C", Field2 = "D" }
                        };

        await using var memoryStream = new MemoryStream();
        await using var streamWriter = new StreamWriter(memoryStream);
        await using var csvWriter = new CsvWriter(streamWriter, config);

        await csvWriter.WriteRecordsAsync(list);
        await csvWriter.FlushAsync();

        memoryStream.Position = 0;

        await using var fs = new FileStream(@"C:\temp\SO67935249.csv.gz", FileMode.Create);
        await using var zip = new GZipStream(fs, CompressionMode.Compress, true);
        await memoryStream.CopyToAsync(zip);

    }
}
Andrew Morton
  • 24,203
  • 9
  • 60
  • 84
  • I think the example could be simplified by writing to gzipStream directly, i.e. `new Streamwriter(new GzipStream(...))`. Also, `using var ...` will do the cleanup when the method ends, in this case it a scoped using statement might be better, I think that should avoid the need for explicit Flushing. – JonasH Jun 11 '21 at 11:45