Background
We're probably going to use BigQuery to store our immutable business events so that we can replay them later to other services. One approach I'm considering is to store each event as a blob, along with some metadata. To make replay easy, it would be nice to maintain a global order of events by persisting every event to the same table in BigQuery. We expect roughly 10 events per second, which is nowhere near the limit of 100,000 messages per second.
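
To make the "blob with some metadata" idea concrete, here is a rough sketch of what one row could look like. All field names (`event_id`, `event_type`, `occurred_at`, `payload`) are hypothetical, and nothing here calls BigQuery; it only builds the flat row that would be persisted.

```python
import json
import uuid
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventRow:
    # All field names are hypothetical, chosen only for illustration.
    event_id: str
    event_type: str
    occurred_at: str  # ISO 8601 UTC timestamp, used for ordering on replay
    payload: str      # the event itself, serialized as an opaque JSON blob


def to_row(event_type: str, body: dict) -> dict:
    """Wrap an event body in metadata, producing one flat row as a dict."""
    row = EventRow(
        event_id=str(uuid.uuid4()),
        event_type=event_type,
        occurred_at=datetime.now(timezone.utc).isoformat(),
        payload=json.dumps(body, sort_keys=True),
    )
    return asdict(row)


example_row = to_row("order_placed", {"order_id": 42})
```

Keeping the payload opaque means a single table can hold every event type without schema changes, at the cost of not being able to query inside the blob directly.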
Question
1. Would it be OK to simply persist all events in the same table?
2. Would it be better to shard events across different tables (for example by event type, topic, or date)?
3. If (2), is it possible to join or scan across multiple tables sorted by time, so that events can still be replayed in the same order?
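
To illustrate the ordering I'm after in the last question, here is a minimal client-side sketch. It assumes each shard can be read back already sorted by a timestamp field (hypothetically named `occurred_at`), and it uses plain in-memory lists in place of real query results. It is not a BigQuery feature, just a picture of the desired outcome.

```python
import heapq
from operator import itemgetter


def replay_in_order(*shards):
    """Lazily merge per-shard event streams into one stream ordered by timestamp.

    Each shard must already be sorted by "occurred_at" (a hypothetical field name).
    """
    return heapq.merge(*shards, key=itemgetter("occurred_at"))


# Stand-ins for rows read from two sharded tables, each sorted by time.
orders = [
    {"occurred_at": 1, "event_type": "order_placed"},
    {"occurred_at": 4, "event_type": "order_shipped"},
]
payments = [
    {"occurred_at": 2, "event_type": "payment_received"},
    {"occurred_at": 3, "event_type": "payment_settled"},
]

merged = list(replay_in_order(orders, payments))
```

Events with identical timestamps would need an extra tie-breaker (such as a sequence number) to get a deterministic global order.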