Is there much difference (in time performance) in using BatchGetItem vs issuing several GetItem in parallel?

My code will be cleaner if I can use GetItem and just handle the parallelisation myself.

However, if there's a definite time performance advantage to BatchGetItem then I'd certainly use that.

Lawrence Wagerfield

1 Answer

BatchGetItem already does its work in parallel:

In order to minimize response latency, BatchGetItem retrieves items in parallel.

Although I don't have benchmarks for you, a single BatchGetItem can get up to 100 items in parallel. BatchGetItem is also a single API call. Thus, performing one API call to get 100 items should be much faster than making 100 individual GetItem API calls, on network latency alone.
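To make the comparison concrete, here is a minimal sketch of both approaches, assuming a boto3-style low-level DynamoDB client that exposes `get_item(TableName=..., Key=...)` and `batch_get_item(RequestItems=...)`. The function names `get_items_parallel` and `get_items_batch`, the `max_workers` and `max_attempts` parameters, and the retry policy are illustrative assumptions, not part of any AWS SDK.

```python
from concurrent.futures import ThreadPoolExecutor


def get_items_parallel(client, table, keys, max_workers=10):
    """Issue one GetItem per key on a thread pool; results follow the order of `keys`."""
    def fetch(key):
        # A key with no matching item yields a response without an "Item" entry.
        return client.get_item(TableName=table, Key=key).get("Item")

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, keys))


def get_items_batch(client, table, keys, max_attempts=5):
    """Fetch up to 100 keys with BatchGetItem, re-requesting any UnprocessedKeys."""
    if not keys:
        return []
    if len(keys) > 100:
        raise ValueError("BatchGetItem accepts at most 100 keys per call")
    request = {table: {"Keys": list(keys)}}
    items = []
    for _ in range(max_attempts):
        response = client.batch_get_item(RequestItems=request)
        items.extend(response.get("Responses", {}).get(table, []))
        # Keys the service could not process in this call are returned for retry.
        request = response.get("UnprocessedKeys") or {}
        if not request:
            return items
    raise RuntimeError("keys still unprocessed after retries")
```

With boto3 installed, `client` would typically be `boto3.client("dynamodb")`. Note the differences in shape: the batch version returns only the items that exist, in no guaranteed order, and has to handle `UnprocessedKeys` (a production version would likely back off between attempts), while the parallel version returns `None` for missing keys and keeps the input order.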

Marcin
  • "Thus, performing one API call to get 100 items should be much faster then doing 100 individual API calls using GetItem due to just network latency." Not if those 100 individual API calls are all in parallel, theoretically. The time of the operation is simply MAX(all_request_times). As you've said: BatchGetItem does its work in parallel, but then so will my parallel GetItem requests. Therefore, the question is: is BatchGetItem consistently and measurably faster than manually performing GetItem in parallel? As I have mentioned: I'm doing this as in my case, it makes my code cleaner. – Lawrence Wagerfield Dec 01 '20 at 23:24
  • @LawrenceWagerfield I don't have numbers for you. But this is something that can be benchmarked rather easily on your existing database. There are also other factors to consider. The 100 calls in parallel will put stress on your own servers/app. `BatchGetItem` does parallel work on AWS servers which reduces stress on your own computational and memory resources. – Marcin Dec 01 '20 at 23:34
  • Yes, that's true regarding the extra stress on the client machine making the requests. I hadn't specified this, but I will only be making 2 requests, and the client machine is a Lambda function (so I don't need to worry about servicing many clients in parallel, as Lambda serializes all access to Lambdas / doesn't reuse the same Lambda instance to service multiple requests at once), meaning the extra resource usage doesn't really concern me: only execution time does. I suppose the extra HTTP call will increase memory usage slightly, which does factor into it... but it's only 2 requests :) – Lawrence Wagerfield Dec 01 '20 at 23:41
  • @LawrenceWagerfield If it's just 2 requests, then there is probably not much benefit to `BatchGetItem` to begin with, and if it's easier to program and manage, then I would stick with `GetItem`. – Marcin Dec 01 '20 at 23:44
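Since nobody in the thread has numbers, the benchmark suggested in the comments could be run with a small timing helper like the sketch below. The name `median_latency_ms` and its `repeats` and `warmup` parameters are hypothetical; `fn` stands for any zero-argument callable that wraps either the batch call or the parallel GetItem calls.

```python
import statistics
import time


def median_latency_ms(fn, repeats=20, warmup=2):
    """Call `fn` repeatedly and return the median wall-clock latency in milliseconds."""
    for _ in range(warmup):
        fn()  # untimed warm-up calls absorb one-off costs such as connection setup
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

Running it once per approach against the same keys, from the same environment as the real workload (here, inside the Lambda function), gives comparable figures. The median is used rather than the mean so that an occasional slow request does not dominate the result.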