
I'm using the AWS Java SDK in an Apache Spark job to populate a DynamoDB table with data extracted from S3. The Spark job writes data using single PutItem requests at a very intense rate (three m3.xlarge nodes used only to write) and without any custom retry policy.

The DynamoDB docs state that the AWS SDK has a backoff policy, but that a ProvisionedThroughputExceededException can eventually be raised if the rate is too high. My Spark job ran for three days and was constrained only by DynamoDB throughput (500 units), so I expect the request rate was extremely high and the retry queue extremely long, yet I saw no sign of thrown exceptions or lost data.
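For context, here is a minimal sketch of how each write looks and where that exception would surface if the SDK exhausted its retries. The table name, key names, and the raised retry cap are placeholders, not my actual values:

```java
import com.amazonaws.ClientConfiguration;
import com.amazonaws.retry.PredefinedRetryPolicies;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException;
import com.amazonaws.services.dynamodbv2.model.PutItemRequest;

import java.util.HashMap;
import java.util.Map;

public class ThrottledPutSketch {

    public static void main(String[] args) {
        // The SDK's default DynamoDB retry policy already retries throttled
        // requests with exponential backoff; the custom max-retries value
        // below is only an illustration of where that knob lives.
        ClientConfiguration config = new ClientConfiguration()
                .withRetryPolicy(PredefinedRetryPolicies
                        .getDynamoDBDefaultRetryPolicyWithCustomMaxRetries(20));

        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
                .withClientConfiguration(config)
                .build();

        // Placeholder item; the real job builds items from S3 records.
        Map<String, AttributeValue> item = new HashMap<>();
        item.put("partition_key", new AttributeValue().withS("example-partition"));
        item.put("sort_key", new AttributeValue().withS("example-sort"));

        try {
            client.putItem(new PutItemRequest()
                    .withTableName("my-table") // placeholder table name
                    .withItem(item));
        } catch (ProvisionedThroughputExceededException e) {
            // Only reached after the SDK has exhausted all of its retries,
            // which may explain why a long throttled run never surfaces it.
            System.err.println("Throttled past the retry limit: " + e.getMessage());
        }
    }
}
```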

So my question is: when is it actually possible to get this exception while writing to DynamoDB at a very high rate?

chuwy

1 Answer


You can also get a throughput exception if you have a hot partition. Because throughput is divided between partitions, each partition has a lower limit than the table's total provisioned throughput, so if you write to the same partition too often, you can hit that limit even though you are not using the full provisioned throughput. A write pattern like the sketch below shows how this happens.
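To illustrate, here is a sketch against a hypothetical `events` table keyed on `device_id` / `ts`. Every write targets the same partition key, so all traffic lands on one partition and can be throttled even while total consumed capacity stays below the provisioned limit:

```java
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.PutItemRequest;

import java.util.HashMap;
import java.util.Map;

public class HotPartitionSketch {

    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        for (int i = 0; i < 10_000; i++) {
            Map<String, AttributeValue> item = new HashMap<>();
            // Same partition key on every write: this is the "hot" key.
            item.put("device_id", new AttributeValue().withS("device-42"));
            // Unique sort key, so every item is distinct, but they all
            // belong to the same partition.
            item.put("ts", new AttributeValue()
                    .withN(Long.toString(System.currentTimeMillis() + i)));

            client.putItem(new PutItemRequest()
                    .withTableName("events") // hypothetical table
                    .withItem(item));
        }
    }
}
```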

Another thing to consider is that DynamoDB accumulates unused throughput and uses it as burst capacity, so you can briefly exceed your provisioned limit for a short duration.

Edit: DynamoDB now has an adaptive capacity feature, which somewhat mitigates the hot-partition problem by redistributing total throughput unequally across partitions.

Tofig Hasanov
  • I believe it's very unlikely for me to get a hot partition, as the keys (partition and sort) are very evenly distributed (and both are unique in 99.9% of cases). Still, I'm very surprised that I didn't get an exception over such a long run with such a huge queue. – chuwy Jul 20 '17 at 05:48
  • @chuwy Did you check the metrics for DynamoDB? You can see what write throughput was actually used there and whether there were any errors. Perhaps your implementation has some bottleneck that results in a relatively low write rate (e.g. you are making sync calls sequentially instead of async parallel calls). Can't say much without looking at the code. – Tofig Hasanov Jul 20 '17 at 06:17
  • Thanks @TofigHasanov, this is already useful. Indeed, this is a few parallel streams of sync sequential requests, as it doesn't really make sense to use async puts from Spark. Consumed write capacity in the metrics looks aligned with provisioned (1000). Throttled write requests stay at a level of 400-600. – chuwy Jul 20 '17 at 09:02