Background about what I'm trying to achieve
I'm implementing in app purchase verification for an app. In theory the following steps should be performed to avoid delivering the same purchase to the user multiple times:
- The app calls a Firebase Cloud Function taking the purchase verification data as parameters.
- The Cloud Function calls the corresponding Google server or Apple server to check that the purchase is valid.
- The Cloud Function checks that the purchase was not yet delivered to the user.
- The Cloud Function delivers the purchase to the user.
For implementing step 3, I thought of using a Firestore collection which stores information about all purchases that were already delivered. It was supposed to be structured like this:
Firestore root
|
|-> purchases (Collection)
|
|-> {purchaseID} (Document)
|
|-> sku: String (What was bought?)
|-> uid: String (Who bought it?)
purchaseID was supposed to be the either the order ID (for in app purchases made on the Android app) or the transaction ID (for in app purchases made on the iOS app). In case you are not familiar with those IDs: Googles order IDs are something like GPA.1234-5678-9012-34567
while Apples transaction IDs are formatted like 1234567890123456
.
During testing I noticed that order IDs are seemingly random (i.e. they are neither monotonically ascending nor monotonically descending), but transaction IDs appear to be monotonically ascending. According to the Best practices and Limits of Firestore as well as this answer to another Stackoverflow question, it is an antipattern to use monotonically increasing IDs as this would cause issues when there are more than 500 writes per second. While I don't expect to ever reach 500 in app purchases per second (I certainly doubt there is any app to reach such a high rate of purchases), I still would like to avoid antipatterns.
Further elaboration on why I need to store these monotonically increasing IDs
In order to not deliver the same in app product multiple times to the user, I need to keep track of the purchases that were already delivered. Let me try to explain that by sharing more insights about the app we're developing: The app lets users buy 5, 10 or 20 additional support tickets which allows them to ask the corresponding amount of questions. To achieve this, the user document of each user features an integer representing how many remaining questions that user may ask.
Let's assume a user using an iOS device buys 10 tickets. The app uses the verification data (that's basically a very long base64 encoded string) of the in app purchase to call a Firebase Cloud Function to verify and deliver the purchase to that user. The Cloud Function starts of by calling an Apple server to check whether that verification data is valid. The Apple server responds with a JSON object containing the purchase details for transaction ID 23 (IDs are shortened in this example for better readability). After that, the Cloud Function needs to determine whether the 10 tickets related to transaction ID 23 were already delivered. If these 10 tickets were not yet delivered, we update the users document to reflect that this user is now allowed to ask 10 additional questions. Additionally we need to keep track that transaction ID 23 is now delivered. To ensure that all this is performed atomically, that aforementioned check and the update are performed in one transaction.
Let's now assume that later on the same user buys another 10 tickets. Again, the app will call the Cloud Function to verify and deliver the purchase. This time, the Apple server responds with a JSON object containing details about transaction ID 23 (which is the old purchase that was already delivered) and transaction ID 36 (which is the new purchase). In order to correctly add 10 tickets (instead of 20) to the users counter, we need to detect that the purchase with transaction ID 23 was already delivered to the user, hence the need to somehow store the transaction ID.
You may now wonder, if delivering only the newest purchase (i.e. the purchase with the highest transaction ID) would be a feasible solution. Sadly, this could cause some issues where purchases might not be delivered correctly. Assume the user completely exits the app (i.e. the app is not running in the background anymore), buys 15 tickets (by purchasing 10 and 5 tickets) directly via the App Store and then opens our app again. The app will now be noticed about the two purchases and call the Cloud Function twice. During both calls, the cloud function will deliver the newest purchase which is the bundle of 5 tickets. Instead of receiving 15 additional tickets, the user only received 10 tickets.
Generalized question and possible (non)solutions I thought of
Based on my specific case, a generalized question arises: How is one supposed to store monotonically increasing IDs that are generated from an external source?
Here are a few thoughts and ideas I came up with:
- Use IDs generated by Firestore, store the external ID as a field in the document and use sharding as explained in sharded timestamps. While this is a good scalable solution for most use cases, it might not work for everyone. Take my use case as an example: In order to avoid delivering the purchase to the user multiple times, step 3 (checking if the purchase is already delivered) and step 4 (delivering the purchase) must be performed in one transaction. However, determining whether there is a document with a specific external ID requires n queries (where n is the amount of shards), which is not possible inside a transaction.
- Permutate the digit values of the external ID and use that permutated ID as the document ID. For example each 1 might be permutated to an 8, each 2 to a 4, each 3 to a 7 and so on. While this would result in neighboring IDs being spaced slightly further apart, it won't help against hotspotting because similar IDs aren't being scattered very far apart. Furthermore, this is not an option if you need to order on the external ID (which is not needed in my use case).
- Use the reversed external ID as the document ID. This would cause the least significant digit to become the most significant digit. While I'm certain this would help at least a bit (by using a decimal system this should at least result in something similar to using 10 shards), I'm not sure whether this would scale infinitely. Similar to my second thought, this is not a solution if you need to order on the external ID, but it would be sufficient in my use case.
- Because similar external IDs should result in completely different document IDs, one could use a perfect (i.e. collision free) hashing function. However, finding a perfect hashing function for a specific use case might be difficult. Up to a certain length of the external ID, a cryptographic hashing function such as SHA-256 or SHA-512 could also be used. This idea feels way overengineered to me. There must be a simpler solution. Again, this won't be a solution if you need to order on the external ID or if you need to infer the external ID from the document ID.
- Similar to the idea of using a hash function, one could also use an encryption algorithm such as AES to scatter the external IDs. Again, this feels way overengineered and is not a solution if you need to order on the external ID.
I'm certain someone must have already stumbled upon this problem (especially because there are so many apps featuring in app purchases). How did you solve this? Am I just overengineering my data structure for the sake of avoiding antipatterns? Or is there some simple solution which I'm totally missing?