
I'm fairly new to Swift and very new to NIO.

I'm adding Swift code to a large project that needs to upload and download a lot of data (GBs) to and from AWS. To that end, I've imported the GitHub project Soto, which relies heavily on NIO.

Most methods that send/receive data do so through ByteBuffer structs. My application already has the data to upload in Foundation Data objects. I'm having trouble figuring out the best way to get these Data objects into NIO.

In the documentation for NIO's ByteBuffer (2.26.0), it states:

Supported types: A variety of types can be read/written from/to a ByteBuffer. ... Out of the box, ByteBuffer supports for example the following types (non-exhaustive list):

  • String/StaticString
  • Swift’s various (unsigned) integer types
  • Foundation's Data
  • [UInt8] and generally any Collection of UInt8

However, the latest swift-nio package has no ByteBuffer support for Foundation Data objects. Instead, it supports DispatchData objects, which in turn seem to have no interoperability with Data objects.

What I want to avoid is making a copy of every block of data (100's of MB at a time), just to convert between Data and DispatchData types.

So...

Right now my thinking is one of the following:

  • I'm completely lost, and there's a simple solution I haven't found

  • The solution is to create some kind of DispatchData wrapper backed by a Data object (DispatchData is a struct, so it can't literally be subclassed)

  • Initialize the ByteBuffer using a DispatchData created with the no-copy initializer pointing at the raw bytes of the Data object, along with a custom deallocator that simply retains the Data object until the ByteBuffer and DispatchData are destroyed (a rough sketch follows below)
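
To be concrete about option #3, here's the rough shape of what I have in mind (illustrative only, and I'm not sure it's safe for every Data representation, since the pointer from withUnsafeBytes is only guaranteed valid inside the closure):

import Dispatch
import Foundation

// Sketch only: wrap a Data's bytes in a DispatchData without copying.
// The .custom deallocator captures `data`, keeping it alive until Dispatch
// releases the no-copy buffer.
func dispatchData(wrapping data: Data) -> DispatchData {
    return data.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> DispatchData in
        DispatchData(bytesNoCopy: raw, deallocator: .custom(nil, { _ = data }))
    }
}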

I would appreciate any thoughts, experience, or suggestions (particularly if it's option #1).


2 Answers


You'll need to import NIOFoundationCompat to get any of NIO's methods that work with Foundation data types such as Data (or JSONDecoder/JSONEncoder). NIOFoundationCompat is just another module of the swift-nio package, so you won't need another dependency.
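
For example, a minimal round trip looks like this (a sketch assuming NIO 2.x; the sample bytes are placeholders):

import Foundation
import NIO
import NIOFoundationCompat

let data = Data("hello, NIO".utf8)

// Data -> ByteBuffer (this convenience comes from NIOFoundationCompat)
var buffer = ByteBufferAllocator().buffer(data: data)

// ByteBuffer -> Data (also from NIOFoundationCompat)
let readBack = buffer.readData(length: buffer.readableBytes)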

But just to be clear: under the hood, there will always be copies. You probably don't need to worry about them, though; copies are extremely fast on today's CPUs. If you absolutely want to avoid copies, you'll need to create ByteBuffers straight away. To help you with that, you may want to add to your question where the data you're sending over the network comes from.
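
For example, if the data comes from files on disk, you can read straight into ByteBuffers with NIO's NonBlockingFileIO and never materialize a Data at all (a sketch; the path is illustrative and the thread pool sizing is arbitrary):

import NIO

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
let threadPool = NIOThreadPool(numberOfThreads: 2)
threadPool.start()
let fileIO = NonBlockingFileIO(threadPool: threadPool)
let eventLoop = group.next()

// Open the file and read its whole contents into a single ByteBuffer.
let bufferFuture: EventLoopFuture<ByteBuffer> = fileIO
    .openFile(path: "/tmp/upload.bin", eventLoop: eventLoop)
    .flatMap { handle, region in
        fileIO.read(fileRegion: region,
                    allocator: ByteBufferAllocator(),
                    eventLoop: eventLoop)
            .always { _ in try? handle.close() }
    }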

  • Thanks, Johannes! I knew the answer was #1. ;) On an aside, I wasn't worried about speed as much as memory usage. I could have as many as 4 upload operations running simultaneously, each holding on to as much as 256 MB of data in Data objects. Having to duplicate them all means there could be a GB of RAM that's nothing more than a copy of another GB of RAM. That seems excessive to me. I'll look into refactoring my code to see if I can create the ByteBuffers earlier—and discard the Data objects. – James Bucanek Mar 08 '21 at 19:03

If you are concerned about memory usage and are uploading large buffers, perhaps you should be using AWSPayload.stream. This allows you to stream small ByteBuffers to AWS. Here is an example of streaming Data to S3 in 16 KB chunks:

import Foundation
import SotoS3

// `s3` (an S3 client) and `name` (the bucket name) are assumed to be
// properties of the enclosing type.
func uploadData(_ data: Data) -> EventLoopFuture<S3.PutObjectOutput> {
    var index = 0
    let payload = AWSPayload.stream { eventLoop in
        let maxChunkSize = 16 * 1024
        let size = min(maxChunkSize, data.count - index)
        // are we done yet?
        if size == 0 {
            return eventLoop.makeSucceededFuture(.end)
        } else {
            // create a ByteBuffer holding the next chunk and return it
            let byteBuffer = ByteBufferAllocator().buffer(data: data[index..<(index + size)])
            index += size
            return eventLoop.makeSucceededFuture(.byteBuffer(byteBuffer))
        }
    }
    let putRequest = S3.PutObjectRequest(body: payload, bucket: name, key: "tempfile")
    return s3.putObject(putRequest)
}
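
A hypothetical call site, assuming the enclosing type provides the s3 client and bucket name used above:

let data = try Data(contentsOf: URL(fileURLWithPath: "/tmp/upload.bin"))
uploadData(data).whenComplete { result in
    switch result {
    case .success(let output):
        print("uploaded, ETag: \(output.eTag ?? "none")")
    case .failure(let error):
        print("upload failed: \(error)")
    }
}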

  • I guess great minds think alike! I actually ended up using a stream. I decided to do the conversion on the front end: My individual `Data` buffers get converted into a FIFO of ByteBuffers (allowing me to release the `Data` objects early and merge tiny blocks). At delivery time, the only thing my payload stream block has to do is `return popBuffer()`. – James Bucanek Apr 15 '21 at 15:59
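
A rough sketch of the FIFO approach described in the comment above (hypothetical: the queue type and popBuffer are illustrative names, and this assumes the queue is fully populated before streaming begins, so an empty queue means the upload is done):

import Foundation
import NIO
import NIOFoundationCompat
import SotoCore

// Hypothetical FIFO: Data blocks are converted to ByteBuffers up front so the
// Data objects can be released early.
final class ByteBufferFIFO {
    private var buffers: [ByteBuffer] = []

    func push(_ data: Data) {
        buffers.append(ByteBufferAllocator().buffer(data: data))
    }

    // Returns nil once the queue is drained.
    func popBuffer() -> ByteBuffer? {
        buffers.isEmpty ? nil : buffers.removeFirst()
    }
}

// At delivery time, the stream closure just pops the next buffer,
// or signals .end once the queue is empty.
func payload(from fifo: ByteBufferFIFO) -> AWSPayload {
    AWSPayload.stream { eventLoop in
        if let buffer = fifo.popBuffer() {
            return eventLoop.makeSucceededFuture(.byteBuffer(buffer))
        }
        return eventLoop.makeSucceededFuture(.end)
    }
}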