2

What would be the best data model to use in a metering application for example and app that captures daily water meter data.

What I can think of is that I could have a water_reading table that captures data for all users. Example:

2014-05-20 09:40 AM  -  user_1  -  0
2014-05-20 09:41 AM  -  user_1  -  4
2014-05-20 09:42 AM  -  user_1  -  9
2014-05-20 09:40 AM  -  user_2  -  0
2014-05-20 09:41 AM  -  user_2  -  2
2014-05-20 09:42 AM  -  user_2  -  7

Is that the best way to capture the data?

My main use for this data is to give the user the ability to view charts etc. in order to get usage patterns.

For example, a monthly water usage line chart.

Challenges

The only issue I can foresee with this model is that the readings need to be regular. If the user starts the data entry at a non-zero point, will it still be valid? Or if the user misses a few entries, how do I fill in the gaps etc.

For example: If a user entered the data the following data:

day_1 12pm - 20
day_2 12pm - 50
day_5 12pm - 120

How would I create a chart that shows daily water usage. Because as per the data:

day_1 : 20 - ? (I don't know if the meter was started the same day or the 5 days ago)
day_2 : 50 - 20
day_3 : ? - 50
day_4 : ? - ?
day_5 : 120 - ?

Any help would be much appreciated.

PS: I am using Ruby on Rails and database is postgresql.

  • 1
    I don't understand your problem. – Mike Szyndel May 20 '14 at 14:01
  • The problem is - how to create a data model that takes care of issues I can foresee in the challenges section of the question. – Pranay Joshi May 20 '14 at 14:06
  • Yeah, I don't understand those exactly. Why would it matter what was first reading or if any of those was skipped? – Mike Szyndel May 20 '14 at 14:07
  • I updated the question. Does that make sense? – Pranay Joshi May 20 '14 at 14:14
  • Unless it's uni assignment (and some grumpy prof will not like it) you should just use first measurement as a base one and show graph only from that day on. When it comes to data irregularity, since your values seems to be incrementing you can just average out the missing values (so when data is input for day 5 we can assume that usage per day was (120-50)/2 = 35 for day 4 and 5). Then you can take it further and correct for irregular times (if there was more than 24 hrs between measurements) – Mike Szyndel May 20 '14 at 14:38

2 Answers2

1

You probably want to think about separating the information about the users from the information about the water usage.

So maybe have a user model, a meter model, and a reading model.

Users can have many meters, meters can have many readings. Each reading has a date and a value.

This will minimise the amount of repeated data in your model. Making it easy to change things like the user name.

John C
  • 4,276
  • 2
  • 17
  • 28
  • I agree... But for the reading model itself, would you suggest a better way to capture data? One that takes care of the challenges I mentioned. – Pranay Joshi May 20 '14 at 14:04
  • 1
    The last thing you want to do is to build in a requirement such as the readings must be regular. Just use a decent graphing library and it won't be a problem. – John C May 20 '14 at 14:06
  • jPlot is the one I am inclined towards. Do you suggest anything else? – Pranay Joshi May 20 '14 at 14:07
  • 1
    [Highcharts](http://highcharts.com) are awesome. – John C May 20 '14 at 14:09
  • Why introduce additional complexity with meter model? Can one user have many measurements for same time? – Mike Szyndel May 20 '14 at 14:11
  • Yes, why not. Any person can own more than one house. Of course it all depends on the problem you are trying to solve. But why build in an assumption that could burn you later? Rails makes relationships easy to deal with. Also you can put the start time in the meter. – John C May 20 '14 at 14:13
  • Mostly because you have to take care of complexity, so you should avoid introducing it if not required. – Mike Szyndel May 20 '14 at 14:39
0

You should look at the kind of queries you need to perform and decide what model is going to work best for you as there are many options available. You may find that an array of nested composite types (ie an array of Readings in the Users table) does the job, but if you need to reference those readings from something other than the user or if you need indexing on certain values within an array of composite types then normalisation is your safest bet.

It sounds like you will be performing a lot of aggregation and time series generation queries so make sure the ORM you are using is up to the job. I've found ActiveRecord less than satisfactory for my needs in so many areas, and moved to Sequel which I have found to be a much more capable data access library, its postgres feature support is excellent and performance was also much better.

It is likely you will need to use window functions or recursive common table expressions and possibly generator functions for time series type queries (ie daily, monthly, quarterly usage). To get some idea of the SQL you will be dealing with to handle gaps in your water_readings, have a look at these stack overflow questions:

A guide for time series querying with postgresql:

You will be able to specify queries detailed above using Sequel with Postgres and get a result (ie a Dataset) which you can further chain/scope or compose within another query, but in ActiveRecord you would need to use raw SQL strings and get a result (an array) which you can't chain or scope.

Really take a good look at your reporting requirements, perhaps you need to not only report a users water usage by day/month/quarter, but also compare or rank with medians/averages/nth percentile etc of others in the same or similar demographic. How will you describe those kinds of queries so they work efficiently over time as dataset sizes increase? Will your data access layer/ORM work for you or is it more suited to simple CRUD type problems?

Community
  • 1
  • 1
Andrew Hacking
  • 6,296
  • 31
  • 37
  • Thanks for a comprehensive answer! There is a lot of good information here. I will look into Sequel and see if I can move to it in my case. Esp. since I am new to Rails and would't want to add too much customization to the standard suite. – Pranay Joshi May 20 '14 at 16:55
  • Just my 2c but don't be afraid to move beyond what 37 signals gives you in their opinionated framework. Yes there is a cost to learn, but its better than struggling with tools don't work well you. I started out with a traditional rails app (server rendered pages) but now I have moved to 100% JSON api, ditched sprockets, rails rendering pipeline and ActiveRecord, so the only real value I am getting out of rails is Rack compliance and the router/controller separation. Frameworks like Sinatra and Padrino are a better choice if you are doing modern web apps where your server is just serving JSON. – Andrew Hacking May 22 '14 at 00:52