Azure doesn't let me unzip a file directly in a blob container. I downloaded a zip file and now need to extract the CSV file inside it, but what I get is a 0-byte file. I can download the zip to my local computer and see the embedded CSV, so the zip file isn't corrupt. I get no errors, just a zero-byte output file. What am I doing wrong? I've tried all of these options unsuccessfully:
using (MemoryStream ms = new MemoryStream())
{
    await zipOutputBlob.DownloadToStreamAsync(ms);
    using (var zipStream = new GZipStream(ms, CompressionMode.Decompress))
    {
        CloudBlockBlob unzippedBlob = container.GetBlockBlobReference(String.Format("{0:yyyy-MM-dd}", lastWriteTime) + " " + Path.GetFileNameWithoutExtension(fileName) + ".csv");
        unzippedBlob.Properties.ContentType = "text/csv";
        using (Stream outputFileStream = await unzippedBlob.OpenWriteAsync())
        {
            await zipStream.CopyToAsync(outputFileStream);
            outputFileStream.Flush();
        }
    }
}
2nd try:
using (MemoryStream ms = new MemoryStream())
{
    await zipOutputBlob.DownloadToStreamAsync(ms);
    using (var zipStream = new GZipStream(ms, CompressionMode.Decompress))
    {
        CloudBlockBlob unzippedBlob = container.GetBlockBlobReference(String.Format("{0:yyyy-MM-dd}", lastWriteTime) + " " + Path.GetFileNameWithoutExtension(fileName) + ".csv");
        unzippedBlob.Properties.ContentType = "text/csv";
        await unzippedBlob.UploadFromStreamAsync(zipStream);
    }
}
3rd try:
using (MemoryStream ms = new MemoryStream())
{
    await zipOutputBlob.DownloadToStreamAsync(ms);
    using (var zipStream = new GZipStream(ms, CompressionMode.Decompress))
    {
        CloudBlockBlob unzippedBlob = container.GetBlockBlobReference(String.Format("{0:yyyy-MM-dd}", lastWriteTime) + " " + Path.GetFileNameWithoutExtension(fileName) + ".csv");
        unzippedBlob.Properties.ContentType = "text/csv";
        using (Stream outputFileStream = await unzippedBlob.OpenWriteAsync())
        {
            await zipStream.CopyToAsync(outputFileStream);
            outputFileStream.Flush();
        }
    }
}
4th try:
using (MemoryStream ms = new MemoryStream())
{
    await zipOutputBlob.DownloadToStreamAsync(ms);
    using (DeflateStream decompressionStream = new DeflateStream(ms, CompressionMode.Decompress))
    {
        CloudBlockBlob unzippedBlob = container.GetBlockBlobReference(String.Format("{0:yyyy-MM-dd}", lastWriteTime) + " " + Path.GetFileNameWithoutExtension(fileName) + ".csv");
        unzippedBlob.Properties.ContentType = "text/csv";
        using (Stream outputFileStream = await unzippedBlob.OpenWriteAsync())
        {
            await decompressionStream.CopyToAsync(outputFileStream);
            outputFileStream.Flush();
        }
    }
}
5th try:
using (var inputStream = new MemoryStream())
{
    await zipOutputBlob.DownloadToStreamAsync(inputStream);
    inputStream.Seek(0, SeekOrigin.Begin);
    using (var gzStream = new GZipStream(inputStream, CompressionMode.Decompress))
    {
        using (var outputStream = new MemoryStream())
        {
            gzStream.CopyTo(outputStream);
            byte[] outputBytes = outputStream.ToArray(); // No data. Sad panda. :'(
            string output = Encoding.ASCII.GetString(outputBytes);
            CloudBlockBlob unzippedBlob = container.GetBlockBlobReference(String.Format("{0:yyyy-MM-dd}", lastWriteTime) + " " + Path.GetFileNameWithoutExtension(fileName) + ".csv");
            unzippedBlob.Properties.ContentType = "text/csv";
            await unzippedBlob.UploadTextAsync(output);
        }
    }
}
6th try:
using (var ms = new MemoryStream())
{
    await zipOutputBlob.DownloadToStreamAsync(ms);
    ms.Seek(0, SeekOrigin.Begin);
    using (DeflateStream decompressionStream = new DeflateStream(ms, CompressionMode.Decompress))
    {
        using (var outputStream = new MemoryStream())
        {
            decompressionStream.CopyTo(outputStream);
            byte[] outputBytes = outputStream.ToArray(); // No data. Sad panda. :'(
            string output = Encoding.ASCII.GetString(outputBytes);
            CloudBlockBlob unzippedBlob = container.GetBlockBlobReference(String.Format("{0:yyyy-MM-dd}", lastWriteTime) + " " + Path.GetFileNameWithoutExtension(fileName) + ".csv");
            unzippedBlob.Properties.ContentType = "text/csv";
            await unzippedBlob.UploadTextAsync(output);
        }
    }
}
Options 5 and 6 fail with this error message on the CopyTo call:
System.Private.CoreLib: Exception while executing function: DefinitiveHealthCare. System.IO.Compression: The archive entry was compressed using an unsupported compression method.
How is this done?
Not sure if this will ever be found via Google, since the question was closed as a duplicate, but I do think my solution could help someone in the future:
private static async Task UnzipDefinitiveFile(string fileName, CloudBlobContainer container, Logger logger, DateTime lastWriteTime, CloudBlockBlob zipOutputBlob)
{
    using (MemoryStream blobMemStream = new MemoryStream())
    {
        // Download the whole zip blob into memory.
        await zipOutputBlob.DownloadToStreamAsync(blobMemStream);
        // Open it as a zip archive rather than as a gzip/deflate stream.
        using (ZipArchive archive = new ZipArchive(blobMemStream))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                logger.Send(SeverityLevel.Verbose, $"Now processing {entry.FullName}");
                // Skip everything except the expected CSV entry.
                if (entry.FullName != Path.GetFileNameWithoutExtension(fileName) + ".csv")
                    continue;
                string validName = String.Format("{0:yyyy-MM-dd}", lastWriteTime) + " " + Path.GetFileNameWithoutExtension(fileName) + ".csv";
                CloudBlockBlob blockBlob = container.GetBlockBlobReference(validName);
                blockBlob.Properties.ContentType = "text/csv";
                // Stream the decompressed entry straight into the new blob.
                using (var fileStream = entry.Open())
                {
                    await blockBlob.UploadFromStreamAsync(fileStream);
                }
            }
        }
    }
}
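The change here is using the ZipArchive class and looping through all the entries in the archive. A .zip file is an archive container with its own headers and entry directory, so GZipStream and DeflateStream, which expect a bare gzip or deflate stream, cannot read it directly. In my case there was only one file in the archive, so I was trying to skip this step, which did not work out.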
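To illustrate the difference without any Azure dependency, here is a minimal sketch that builds a one-entry zip in memory and then reads it both ways. The class name ZipVersusGZipDemo, the helper names, and the entry name report.csv are all made up for illustration. It assumes the entry you want is the first one in the archive.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZipVersusGZipDemo
{
    // Builds a one-entry .zip archive in memory (a stand-in for the downloaded blob).
    public static byte[] BuildZip(string entryName, string content)
    {
        using (var buffer = new MemoryStream())
        {
            // leaveOpen: true keeps the buffer usable after the archive is disposed;
            // the central directory is only written when the archive is disposed.
            using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
            using (var writer = new StreamWriter(archive.CreateEntry(entryName).Open()))
            {
                writer.Write(content);
            }
            return buffer.ToArray();
        }
    }

    // Reads the first entry through ZipArchive, which understands the archive layout.
    public static string ReadFirstEntry(byte[] zipBytes)
    {
        using (var input = new MemoryStream(zipBytes))
        using (var archive = new ZipArchive(input, ZipArchiveMode.Read))
        using (var reader = new StreamReader(archive.Entries[0].Open()))
        {
            return reader.ReadToEnd();
        }
    }

    // Treats the same bytes as a bare gzip stream; returns false if GZipStream rejects them.
    public static bool TryGunzip(byte[] zipBytes)
    {
        try
        {
            using (var input = new MemoryStream(zipBytes))
            using (var gzip = new GZipStream(input, CompressionMode.Decompress))
            using (var output = new MemoryStream())
            {
                gzip.CopyTo(output);
                return true;
            }
        }
        catch (InvalidDataException)
        {
            return false;
        }
    }

    public static void Main()
    {
        byte[] zipBytes = BuildZip("report.csv", "id,name\n1,example\n");
        Console.WriteLine($"GZipStream could read the zip: {TryGunzip(zipBytes)}");
        Console.Write(ReadFirstEntry(zipBytes));
    }
}
```

As far as I can tell, GZipStream should reject the zip bytes with an InvalidDataException, while ZipArchive returns the CSV text. With a real blob, the byte array would be replaced by the MemoryStream filled by DownloadToStreamAsync.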
Another option is to just use Azure Data Factory V2. Its Copy Data activity can load a zipped file, and it can also download a file from an SFTP location.