0

I would like to create multiple files in AWS from a single local file without actually reading the file data into a buffer. Each file has:

  • destination name (in AWS)
  • offset (starting point of the local file)
  • size (how many bytes from the offset to upload)

For example:
Local file size is 1 MB. File 1: "name1", offset = 0 bytes, size = 100 bytes. File 2: "name2", offset = 200 bytes, size = 500 bytes.

File 1 should be uploaded as an independent file whose data is bytes 0 - 100 of the local file. File 2 should be uploaded as a second independent file whose data is bytes 200 - 700 of the local file.

I am using a local Minio server as the object storage server. My code successfully uploads the full content of a local file:

    PutObjectRequest request;
    request.WithBucket("my_bucket_name").WithKey("file_path_in_aws");
    std::shared_ptr<Aws::IOStream> fileData = Aws::MakeShared<Aws::FStream>("some_alloc_tag",
        "path_to_file_to_upload", std::ios_base::in | std::ios_base::binary);

    request.SetBody(fileData);
    PutObjectOutcome outcome = m_pS3Client->PutObject(request);
    if (!outcome.IsSuccess())
        // print error message
        std::cout << outcome.GetError().GetMessage() << std::endl;

I tried uploading by reading the local file into a buffer, then using Aws::StringStream to upload each part of the buffer as an independent file. That worked correctly as well.

But I need to upload the local file as multiple files without reading the data into a buffer, so I tried:

    std::shared_ptr<Aws::IOStream> fileData = Aws::MakeShared<Aws::FStream>("some_alloc_tag",
        "path_to_file_to_upload", std::ios_base::in | std::ios_base::binary);

    request.SetBody(fileData);
    request.SetContentLength(size1);
    fileData->seekg(offset1, std::ios_base::beg);

The call to Aws::S3::S3Client::PutObject fails with the message: "Encountered network error when sending http request"

The Minio server prints this error message:

Error: read tcp 127.0.0.1:9000->127.0.0.1:49811: wsarecv: An existing connection was forcibly closed by the remote host. (*net.OpError)
       5: cmd\fs-v1-helpers.go:323:cmd.fsCreateFile()
       4: cmd\fs-v1.go:1173:cmd.(*FSObjects).putObject()
       3: cmd\fs-v1.go:1089:cmd.(*FSObjects).PutObject()
       2: cmd\object-handlers.go:1631:cmd.objectAPIHandlers.PutObjectHandler()
       1: net\http\server.go:2069:http.HandlerFunc.ServeHTTP()

When I remove the content-length change and the seekg:

    request.SetContentLength(size1);
    fileData->seekg(offset1, std::ios_base::beg);

then everything works and the full file is uploaded.

I found that "UploadPartRequest" exists in the aws-sdk-cpp GitHub repository. I am not sure whether it is what my scenario needs; I couldn't find a proper example that makes clear how to use "UploadPartRequest" or what its purpose is.

Has anyone successfully uploaded part of a local file (with a given offset and length) without reading the file content, just using Aws::FStream (and changing the file position with 'seekg'), or in another way?

Maria B
  • After a quick search, here are some ideas. Look at [binary StringStream](https://stackoverflow.com/questions/48666549/upload-uint8-t-buffer-to-aws-s3-without-going-via-filesystem) options or roll your own [StreamBuffer](https://github.com/aws/aws-sdk-cpp/issues/533). – jarmod Dec 02 '21 at 02:27

1 Answer

0

I am pretty sure UploadPartRequest is used for multipart uploads (usually within the Transfer Manager), where a file is split up into smaller chunks (called parts) and uploaded to S3, where the original file is reconstructed from those parts. I don't think it will be useful in this case.

There is some documentation in the user's guide (here) indicating you can pass a lambda function to accomplish your task.