8

I am planning to create a merchant table, which will have store locations of the merchant. Most merchants are small businesses and they only have a few stores. However, there is the odd multi-chain/franchise who may have hundreds of locations.

What would be my solution if I want to put include location attributes within the merchant table? If I have to split it into multiple tables, how do I achieve that?

Thank you!

EDIT: How about splitting the table. To cater for the majority, say up to 5 locations I can place them inside the same table. But beyond 5, it will spill over to a normalised table with an indicator on the main table to say there are more than 5 locations. Any thoughts on how to achieve that?

Bluetoba
  • 885
  • 1
  • 9
  • 16

2 Answers2

11

You have a couple of options depending on your access patterns:

  • Compress the data and store the binary object in DynamoDB.
  • Store basic details in DynamoDB along with a link to S3 for the larger things. There's no transactional support across DynamoDB and S3 so there's a chance your data could become inconsistent.
  • Rather than embed location attributes, you could normalise your tables and put that data in a separate table with the equivalent of a foreign key to your merchant table. But, you may then need two queries to retrieve data for each merchant, which would count towards your throughput costs.
  • Catering for a spill-over table would have to be handled in the application code rather than at the database level: if (store_count > 5) then execute another query to retrieve more data

If you don't need the performance and scalability of DynamoDB, perhaps RDS is a better solution.

craigcaulfield
  • 3,381
  • 10
  • 32
  • 40
  • I'd like to explore binary compressions. Does DynamoDB query/write automatically decode/encode to text? what is the performance degradation? According to this https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_AttributeValue.html, it uses Base64. I am not well versed, but how much storage savings will I get? Whilst on this subject, will this be a good way to store email/chat messages in binary format? – Bluetoba May 07 '18 at 02:36
  • By the way, can you add a secondary index on binary type attributes? – Bluetoba May 07 '18 at 02:44
  • The GZIP and LZO compression algorithms produce a binary output suitable for a storing as a binary object. You might need to experiment with each algorithm to see which is best for your data. Check out https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-use-s3-too.html, which has more details and a complete example for Java and matches pretty closely your email/chat use case. – craigcaulfield May 07 '18 at 02:46
  • Scalar data types (Number, String, Binary, and Boolean) can be used for the sort key element of a local secondary index. – craigcaulfield May 07 '18 at 03:00
  • Thanks craigcaulfield. Just looking at this, https://stackoverflow.com/questions/4715415/base64-what-is-the-worst-possible-increase-in-space-usage it appears encoding it into binary increases the size 4/3 of the original size. Why would people call this process compression? – Bluetoba May 07 '18 at 03:05
  • 1
    That link refers to base64. You'll likely get better compression with one of the other algorithms. – craigcaulfield May 07 '18 at 03:19
0

A bit late to the party, but I believe the right schema would be to have partitionKey as merchantId with sortKey as storeId. This would create individual, separate records for each store and you can store the geo location. This way

  • You would not cross the 400KB threshold
  • Queries become efficient if you want to fetch the location for just 1 of the stores of the merchant. If you want to fetch all the stores, there is no impact with this schema.

PS : I am a Software Engineer working on Amazon Dynamodb.

Nishit
  • 1,276
  • 2
  • 11
  • 25