
I want to send the content of a file as a memory stream to an S3 bucket via Amazon Kinesis Firehose. Below is my attempt, which works fine for small files, but with a 1 GB file I get {"Exception of type 'System.OutOfMemoryException' was thrown."}.

My code snippet:

[HttpPost]
public async Task<bool> Upload()
{
    try
    {
        var filesReadToProvider = await Request.Content.ReadAsMultipartAsync();
        foreach (var stream in filesReadToProvider.Contents)
        {
            var fileBytes = await stream.ReadAsByteArrayAsync(); // THIS IS WHERE EXCEPTION COMES
            using (MemoryStream memoryStream = new MemoryStream(fileBytes))
            {
                PutRecordRequest putRecord = new PutRecordRequest();
                putRecord.DeliveryStreamName = myStreamName;
                Record record = new Record();
                record.Data = memoryStream;
                putRecord.Record = record;
                await kinesisClient.PutRecordAsync(putRecord);
            }
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
        throw;
    }    
    return true;
}

I looked into this link about OutOfMemoryException but I could not comprehend it. Please help me.

Attempt 1:

var filesReadToProvider = await Request.Content.ReadAsMultipartAsync();
foreach (var stream in filesReadToProvider.Contents)
{
    var fileByte = await stream.ReadAsStreamAsync();
    MemoryStream _ms = new MemoryStream();
    fileByte.CopyTo(_ms); // EXCEPTION HERE
    try
    {
        PutRecordRequest putRecord = new PutRecordRequest();
        putRecord.DeliveryStreamName = myStreamName;
        Record record = new Record();
        record.Data = _ms;
        putRecord.Record = record;
        await kinesisClient.PutRecordAsync(putRecord);
    }
    catch (Exception ex)
    {
        Console.WriteLine("Failed to send record to Kinesis. Exception: {0}", ex.Message);
    }
}
  • Why do you read the whole stream to an array? Use `ReadAsStreamAsync` if you want to stream bytes. – tkausl Feb 13 '19 at 15:58
  • Read it in chunks: https://stackoverflow.com/a/2161984/3110695 – FortyTwo Feb 13 '19 at 16:00
  • Try `GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce; GC.Collect();`, or if the file is big, divide it into chunks. – I_Al-thamary Feb 13 '19 at 16:01
  • The logic you apply to the file contents is relevant in this case. You should try to avoid reading all the file and then processing it, instead read and process in chunks. – EzLo Feb 13 '19 at 16:01
  • Go line by line... https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/file-system/how-to-read-a-text-file-one-line-at-a-time – Murray Foxcroft Feb 13 '19 at 16:01
  • It's a very large file. This article helped me once: https://www.strathweb.com/2012/09/dealing-with-large-files-in-asp-net-web-api/ You need to change the IIS configuration and the ASP.NET configuration and do a few extra things. – PWND Feb 13 '19 at 16:02
  • I added more code to my snippet: I would need to traverse the `memory stream` and put it in the Amazon S3 bucket via Amazon Kinesis Firehose. – Unbreakable Feb 13 '19 at 16:04
  • @tkausl: I added your line of code and I did not get that exception, but how do I pass it as a memory stream? Can you please see my code edit? – Unbreakable Feb 13 '19 at 16:34
  • @FortyTwo: I am a newbie; can you kindly guide me on how to incorporate file chunking in my scenario? Do I need to completely delete all the lines of code I have written? – Unbreakable Feb 13 '19 at 16:39
  • @FortyTwo: I am sending the file from Postman. – Unbreakable Feb 13 '19 at 16:55
  • @tkausl: I was able to use your approach, but when I tried to convert the stream into a memory stream I got the same exception. See my edit. – Unbreakable Feb 13 '19 at 17:09
  • @Unbreakable what is the data inside the incoming stream? Is it text or binary? Can you read and push just part of it, or do you need to consume it all at once? Can you provide information about the Record class/structure? – lerthe61 Feb 13 '19 at 17:12
  • @lerthe61: Record class structure. https://docs.aws.amazon.com/sdkfornet/latest/apidocs/items/TKinesisRecordNET45.html – Unbreakable Feb 13 '19 at 17:13
  • It has a Data property which only accepts MemoryStream – Unbreakable Feb 13 '19 at 17:13
  • @lerthe61: I don't need to consume everything at once. And data will be text. I basically need to send my app logs to S3 via firehose – Unbreakable Feb 13 '19 at 17:25
  • Let me try to explain why you are getting OutOfMemoryException in your code: you are trying to get Stream data from a POST request and put it into Amazon Kinesis Firehose. The Stream that you get from the request is "lazy"; it is only a wrapper that lets you fetch data from the incoming request. But once you wrap it in a MemoryStream (an implementation of Stream that holds all of its data in memory), it will start fetching all of the data into your memory. And because you need to use the AWS SDK, you need enough memory to host all of that data in memory. – lerthe61 Feb 13 '19 at 17:30
  • Oh, so if the stream contains text data, and you are OK with sending that data in several chunks, that is doable. I am not familiar with Amazon Kinesis Firehose and how you want to use it, but one solution can be: read a portion of the data, put it into a MemoryStream, send that part to Amazon, repeat. – lerthe61 Feb 13 '19 at 17:31
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/188360/discussion-between-unbreakable-and-lerthe61). – Unbreakable Feb 13 '19 at 17:33
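
The "read a portion, send it, repeat" approach the comments converge on could be sketched roughly like this. This is untested; it reuses the question's `myStreamName` and `kinesisClient`, and the 512 KB chunk size is an assumption chosen to stay under Firehose's documented ~1,000 KB per-record limit. Note that chunk boundaries will split log lines arbitrarily.

```csharp
// Sketch: read the request body in fixed-size pieces and send each piece
// as its own Firehose record, so the whole file is never held in memory.
const int ChunkSize = 512 * 1024; // 512 KB per record (assumption)

using (var requestStream = await Request.Content.ReadAsStreamAsync())
{
    var buffer = new byte[ChunkSize];
    int bytesRead;
    while ((bytesRead = await requestStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
    {
        // Wrap only the bytes actually read in a small MemoryStream,
        // since Record.Data requires a MemoryStream.
        using (var chunk = new MemoryStream(buffer, 0, bytesRead))
        {
            var putRecord = new PutRecordRequest
            {
                DeliveryStreamName = myStreamName,
                Record = new Record { Data = chunk }
            };
            await kinesisClient.PutRecordAsync(putRecord);
        }
    }
}
```

Each iteration only ever holds one chunk in memory, which is why this avoids the OutOfMemoryException even for a 1 GB upload.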

2 Answers

[HttpPost]
public async Task<bool> Upload()
{
    try
    {
        using(var requestStream = await Request.Content.ReadAsStreamAsync())
        {
            PutRecordRequest putRecord = new PutRecordRequest();
            putRecord.DeliveryStreamName = myStreamName;
            Record record = new Record();
            record.Data = requestStream;
            putRecord.Record = record;
            await kinesisClient.PutRecordAsync(putRecord);

        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
        throw;
    }

    return true;
}

This will read the data in chunks. Keep everything in the Stream so you don't keep all the bytes around in a huge array.

  • Thank you so much, but it will not work; the `Record` class expects a `MemoryStream`. Please see this link: https://docs.aws.amazon.com/sdkfornet/latest/apidocs/items/TKinesisRecordNET45.html – Unbreakable Feb 13 '19 at 17:16
  • And see `Attempt 1` in my question. When I tried to convert the stream into a memory stream I again got the same exception. – Unbreakable Feb 13 '19 at 17:17
  • Can you kindly tell me how I can read it in chunks? I think that is my only option. – Unbreakable Feb 13 '19 at 17:22
  • It looks like Amazon Kinesis records aren't meant to be bigger than 1 MB – Aron Feb 13 '19 at 17:24

When reading large files, I use StreamReader's ReadLine() method. It works on large files because it manages file-system caching internally. Can you use this method instead? Is there a reason why you are implementing the MemoryStream class? You had a comment asking how to inject the data; did you try using one of MemoryStream's methods?

https://learn.microsoft.com/en-us/dotnet/api/system.io.memorystream?view=netframework-4.7.2
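
Since the OP says the data is text (app logs), the line-by-line idea could be sketched like this. This is an untested suggestion that reuses the question's `myStreamName` and `kinesisClient`; the 900 KB threshold is an assumption to stay under Firehose's per-record limit, and `SendBatchAsync` is a hypothetical helper, not an SDK method. (StringBuilder.Length counts characters, not bytes, which is close enough for ASCII logs.)

```csharp
// Sketch: read the text line by line with StreamReader.ReadLine and batch
// whole lines into records, so no record splits a log line in half.
const int MaxRecordChars = 900 * 1024; // threshold per record (assumption)

using (var requestStream = await Request.Content.ReadAsStreamAsync())
using (var reader = new StreamReader(requestStream))
{
    var batch = new StringBuilder();
    string line;
    while ((line = await reader.ReadLineAsync()) != null)
    {
        batch.AppendLine(line);
        if (batch.Length >= MaxRecordChars)
        {
            await SendBatchAsync(batch.ToString());
            batch.Clear();
        }
    }
    if (batch.Length > 0)
        await SendBatchAsync(batch.ToString()); // flush the final partial batch
}

// Hypothetical helper: wraps one text batch in a MemoryStream and sends it,
// since Record.Data only accepts a MemoryStream.
async Task SendBatchAsync(string text)
{
    using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(text)))
    {
        await kinesisClient.PutRecordAsync(new PutRecordRequest
        {
            DeliveryStreamName = myStreamName,
            Record = new Record { Data = ms }
        });
    }
}
```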

Update:

Not sure if this is helpful, since the code is substantially different from what you are using; but yours isn't working either, so it's just a suggestion.

http://www.tugberkugurlu.com/archive/efficiently-streaming-large-http-responses-with-httpclient

  • The question does not mention the nature of the data, and in the case of binary data your advice of using ReadLine may introduce more complications in processing. – lerthe61 Feb 13 '19 at 17:08
  • @lerthe61 OP just updated the question to explain why they are using MemoryStream. Also, the inject comment was removed. – J Weezy Feb 13 '19 at 17:11