My code captures numeric values from web services. The captured values are generally increasing, but sometimes they decrease. For now, I just do a dumb capture of what I have observed at a specific time... this is what my table structure looks like now:
id date_collected value1_observed value2_observed
1 2012-01-26 1:00am 500 12
2 2012-01-26 1:01am 90000 NULL
3 2012-01-26 1:02am 100 50
1 2012-01-26 7:23am 502 12
2 2012-01-26 7:24am 90105 NULL
3 2012-01-26 8:09am 97 55
You can see that...
- ID=1 value1 increased by 2 in 6 hours 23 minutes
- ID=2 value1 increased by 105 in 6 hours 23 minutes
- ID=3 value1 decreased by 3 in 7 hours 8 minutes
Difficulties:
- The collection time is inconsistent (I collect multiple times per day with a varying length of time between collections)
- the values can increase or decrease or be null
I have captured tens of millions of rows of these records and I'd like to be able to query them more easily, so I'm thinking I need to transform this data into a schema that is better suited for the questions I want to answer. I'd like to be able to find patterns like:
- "Between the hours of 1am and 5am, these IDs had the largest changes in value1".
- "Over the last 30 days, which IDs have had value2 grow the most (in absolute and relative terms)?"
Are there any suggestions on how I should structure the schema? Would it be possible to transform this data to a traditional star schema?
UPDATE to answer questions:
- I am currently using Microsoft SQL Server, but am open to using MySQL or other open source options.
- If id 1 changes from 50 to 100 to 52, I would like to know that it went +50 and then -48. That way I could see fluctuation and net gain/loss over time. For example, if I were tracking registered users on a site... generally that numbers always goes up, but if people delete their accounts, I would want to identify that trend. However, if my app didn't observe the change from 50 to 100 and only recorded 50 and 52, it would not know that the number ever hit 100.
Thanks for any insight/direction,
-Paul