3

I'm currently using Node.js and Firebase on a project, and I love both. My challenge is that I need to store millions of sales order rows that would look something like this:

{ companyKey: 'xxx', orderKey : 'xxx', rowKey : 'xxx', itemKey : 'xxx', orderQty: '5', orderDate: '12/02/2015' }

I'd like to query these records like the pseudocode below:

Select sum(orderQty) from mydb where companyKey = 'xxx' and itemKey = 'xxx' groupby orderDate

According to various reasons such as Firebase count group by, groupby in general can be a tough nut to crack. I've done it before using Oracle Materialized Views but would like to use some kind of service that just does all of that backend work for me so I can CRUD those sales orders without worrying about the aggregation maintenance. I read in another stackoverflow post that Keen.io might be a good approach to this problem.

How would the internet experts attack this problem if they were using a JavaScript heavy stack and they wanted an outside service to do aggregation by day for them?

A couple of points I'm considering. I'll update as they come up:

1) It seems I might have to take Keen.io off the list. It's $125 for 1M rows. I don't need all the power Keen.io provides, only aggregation by day.

2) Going the Sequelize + PostGreSQL seems to be a decent compromise. I can still use JavaScript, an ORM to alleviate the pain, and PostGreSQL hosting is usually cheap.

Community
  • 1
  • 1
Will Lovett
  • 1,241
  • 3
  • 18
  • 35
  • I think it's too general of a question? Basically the question is; what is the best option for supporting aggregations queries like X on a large dynamic database, available from node.js, and oh, also has a nice low impedance API? – spinkus Feb 27 '16 at 02:32
  • This query is pretty basic but you'll need to alter your structure to compensate for the 'and' portion of the Firebase query. What are the expected results of the 'groupby'? Are you needing analytics on the resulting query (like a count by date?) instead of the query results? Or both? – Jay Feb 27 '16 at 17:33
  • Looks like update pricing will make in $10 for 1M rows: keen.io/pricing – jandwiches Mar 02 '17 at 19:38

1 Answers1

0

It sounds like you want to show a trend in sales of an item over time. That's a very good fit for an event data platform because showing trends over time is really native to the query language. In Keen IO, the idea of "grouping by time" is instead expressed as the concept of "timeframe" (e.g. previous_7_days) and "interval" (e.g. daily).

Here's how you would run that with a simple sum query in Keen:

var sum = new Keen.Query("sum", {
  event_collection: "sales",
  target_property: "orderQty",
  timeframe: "previous_12_weeks",
  interval: "weekly",
  filters: [
    {
      property_name: "companyKey",
      operator: "eq",
      property_value: "xxx"
    },
    {
      property_name: "itemKey",
      operator: "eq",
      property_value: "yyy"
    }
  ]
});

In fact you could calculate the sum for ALL of your companies and products in a single query by using group_by.

var sum = new Keen.Query("sum", {
  event_collection: "sales",
  target_property: "orderQty",
  timeframe: "previous_12_weeks",
  interval: "weekly",
  group_by: ["companyKey", "itemKey"]
});

Keen recently updated their pricing. Depending on the frequency of querying, something like this would be pretty light, in the $10s of dollars per month if you have millions of new transactions monthly.

Michelle Wetzler
  • 734
  • 4
  • 16