
Using: SQL Server 2008, Entity Framework, WCF 4 REST

I have a table for holding the measurement data generated by a monitoring system (aka the app). There are currently about 10 different well-known pieces of data being monitored within the app, and each corresponds to a column in the table. Each customer can "customize" each of their apps to capture from 1 to 10 pieces of data; they only need to capture the pieces of info they are interested in analyzing. Everything's working great (and performance is good) with this straightforward fixed schema. The schema is designed to be multi-tenant, so multiple applications across multiple customers in multiple locations could be pumping data into the same DB: millions and millions of rows of measurement data (I wouldn't be surprised if we go to Azure before too long).

I have now been told that the measurement application will soon be able to monitor additional "things". This new list (so far I'm told it's at around 150 items) could wind up being about 1000 items being measured. On top of that, the user could specify their own criteria for items to monitor/measure (i.e. custom measurements equating to custom columns). The good news is that all the measurement data will be integers.

Now the fun - how do I design the schema for this situation? I would really like to keep the schema fixed. I would also like to keep it as performant as possible given the high volumes of data.

Any help is appreciated.

Current Schema:

CREATE TABLE MeasurementData (
    DataId bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    ApplicationId int NOT NULL,    -- FK to Application table
    DateCollected datetime NOT NULL,
    Length int NULL,
    Width int NULL,
    Height int NULL,
    Color int NULL,
    Shape int NULL,
    Mass int NULL
)
CREATE TABLE Application (
    ApplicationId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CompanyId int NOT NULL,    -- FK to Company table
    SerialNumber nvarchar(50) NOT NULL
)
CREATE TABLE Company (
    CompanyId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CompanyName nvarchar(50) NOT NULL
)

And then we have the user table, the roles table, etc., where there's a 1-n relationship between company and user.

FYI, the web app will then present the data with tables, graphs, etc. (communicating via the REST web service layer).

Ed Sinek
  • You are going to have to switch to storing the data vertically. This will allow for an infinite amount of measures, and also help performance as you can reduce transmitted data to just the measurements the user subscribes to. – Malk Jan 22 '11 at 03:45

2 Answers

Can you add two more tables? One with the types of measurements and the other with a mapping from the type to the measurement itself?

Basically, one table with {DataId, DataMeasurementTypeId, DataValue} and another with {DataMeasurementTypeId, DataMeasurementType}.
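A minimal sketch of those two tables in T-SQL, in case it helps to see them written out. The table names `DataMeasurementType` and `DataMeasurement`, the key choices, and the foreign key back to the question's `MeasurementData` table are assumptions for illustration, not something the asker's schema already has.

```sql
-- Hypothetical lookup table: one row per thing that can be measured,
-- whether built-in (Length, Width, ...) or defined by a customer.
CREATE TABLE DataMeasurementType (
    DataMeasurementTypeId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    DataMeasurementType nvarchar(50) NOT NULL UNIQUE
);

-- Hypothetical value table: one row per (sample, measurement type).
-- Assumes MeasurementData from the question keeps DataId, ApplicationId
-- and DateCollected, and that its wide nullable columns move here as rows.
CREATE TABLE DataMeasurement (
    DataId bigint NOT NULL
        REFERENCES MeasurementData (DataId),
    DataMeasurementTypeId int NOT NULL
        REFERENCES DataMeasurementType (DataMeasurementTypeId),
    DataValue int NOT NULL,   -- all measurements are integers per the question
    PRIMARY KEY (DataId, DataMeasurementTypeId)
);
```

With this shape, adding measurement number 151 (or a customer's custom one) is an INSERT into `DataMeasurementType` rather than an ALTER TABLE, so the schema itself stays fixed.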

That should allow you to provide stored procedures that retrieve all the data measurements as a table.
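One way such a retrieval could look is a PIVOT query that turns the rows back into columns. This is only a sketch: it uses table variables with made-up sample values so that it runs on its own, and the table-variable names simply mirror the column sets described above.

```sql
-- Stand-in tables (table variables) with illustrative sample data only.
DECLARE @DataMeasurementType TABLE (
    DataMeasurementTypeId int NOT NULL PRIMARY KEY,
    DataMeasurementType nvarchar(50) NOT NULL
);
DECLARE @DataMeasurement TABLE (
    DataId bigint NOT NULL,
    DataMeasurementTypeId int NOT NULL,
    DataValue int NOT NULL,
    PRIMARY KEY (DataId, DataMeasurementTypeId)
);

INSERT INTO @DataMeasurementType
VALUES (1, N'Length'), (2, N'Width'), (3, N'Height');

INSERT INTO @DataMeasurement
VALUES (100, 1, 12), (100, 2, 7), (101, 1, 15), (101, 3, 4);

-- Pivot the vertical rows into one row per DataId.
-- A measurement that was not captured for a sample comes back as NULL.
SELECT p.DataId, p.[Length], p.[Width], p.[Height]
FROM (
    SELECT m.DataId, t.DataMeasurementType, m.DataValue
    FROM @DataMeasurement AS m
    JOIN @DataMeasurementType AS t
        ON t.DataMeasurementTypeId = m.DataMeasurementTypeId
) AS src
PIVOT (
    MAX(DataValue)
    FOR DataMeasurementType IN ([Length], [Width], [Height])
) AS p
ORDER BY p.DataId;
```

The column list in the IN clause is fixed when the query is written, so a stored procedure that serves a different set of measurements per customer would probably have to build that list dynamically, or return the rows unpivoted and let the service layer shape them.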

The better option might be to solve it with a (Name, Value) table and have the business object layer take care of constructing the right content. That would fit (and likely perform) better with Google's BigTable approach than with an RDBMS, though.

gbvb

Take a look at these SO examples: one, two, three.

Damir Sudarevic
  • Thanks @Damir, is this a variant on EAV - http://weblogs.sqlteam.com/davidm/articles/12117.aspx ? – Ed Sinek Jan 23 '11 at 01:35
  • Thanks @Damir. If I end up using an Observational Pattern/EAV, and since the data is all int based, that should simplify the schema dramatically. What's your take on the impact on the performance? Will I need to build an indexed view and/or data warehouse if I want decent performance once the measurement data reaches hundreds of millions of rows? – Ed Sinek Jan 23 '11 at 01:44
  • @Ed.S -- well, that depends on the required queries. You will at some point probably add some aggregation tables (daily, weekly, ...) and partition the raw-measurements-values table. – Damir Sudarevic Jan 23 '11 at 13:33
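For illustration, a daily aggregation table of the kind mentioned in the last comment might be sketched as follows. Everything here is an assumption: `DailyMeasurementAggregate` is a hypothetical name, and the query presumes a vertical value table shaped like {DataId, DataMeasurementTypeId, DataValue} (called `DataMeasurement` here) alongside the question's `MeasurementData` table. Partitioning of the raw table is not shown.

```sql
-- Hypothetical rollup table: one row per application, measurement type and day.
CREATE TABLE DailyMeasurementAggregate (
    ApplicationId int NOT NULL,
    DataMeasurementTypeId int NOT NULL,
    DayCollected date NOT NULL,
    SampleCount int NOT NULL,
    MinValue int NOT NULL,
    MaxValue int NOT NULL,
    SumValue bigint NOT NULL,   -- bigint so large daily sums do not overflow
    PRIMARY KEY (ApplicationId, DataMeasurementTypeId, DayCollected)
);

-- Roll up one day of raw values; the date is an arbitrary example.
DECLARE @DayStart datetime = '20110122';

INSERT INTO DailyMeasurementAggregate
    (ApplicationId, DataMeasurementTypeId, DayCollected,
     SampleCount, MinValue, MaxValue, SumValue)
SELECT
    d.ApplicationId,
    m.DataMeasurementTypeId,
    CAST(d.DateCollected AS date),
    COUNT(*),
    MIN(m.DataValue),
    MAX(m.DataValue),
    SUM(CAST(m.DataValue AS bigint))
FROM MeasurementData AS d
JOIN DataMeasurement AS m          -- assumed vertical value table
    ON m.DataId = d.DataId
WHERE d.DateCollected >= @DayStart
  AND d.DateCollected < DATEADD(day, 1, @DayStart)
GROUP BY
    d.ApplicationId,
    m.DataMeasurementTypeId,
    CAST(d.DateCollected AS date);
```

Storing the count and sum rather than an average lets weekly or monthly figures be derived from the daily rows without going back to the raw measurements.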