I am designing a system to record and report on daily measurement data. The data consists of a category identifier, the date/time, and the measurement data (of which there can be up to 500 pieces either as float or int).
The categories are visualized as a tree structure where data is associated with a node as well as with a leaf.
The raw data comes in as a CSV in the following format:
1/6/2001 15:55, /Node1/Node2/Node3, 121, 34, 452, 651, 167
1/6/2001 15:55, /Node1/Node2/Node3/LeafA, 12, 34, 45, 65, 67
1/6/2001 15:55, /Node1/Node4/Node5/LeafB, 21, 32, 43, 54, 65
I am planning on using Adjacency List (see Database Structure for Tree Data Structure) for the tree structure. I am also planning to have a second table just for the measurement data and the date/time. This way, once the tree structure is generated the first time, it can be referenced over and over again by the measurement data table. Also, having a small Adjacency List table makes the system much more readable :). In the Category table below, Name would be a node or leaf name (e.g. Node1 or LeafA) and FullName would be the entire branch path (e.g. Node1/Node2/Node3/LeafA). Not sure I need both, but I think they will come in handy so I don’t have to recreate the FullName when needed.
CREATE TABLE [dbo].[Category](
[CatId] [int] IDENTITY(1,1) NOT NULL,
[ParentCatId] [int] NULL,
[Name] [nvarchar](30) NOT NULL,
[FullName] [nvarchar](MAX) NOT NULL
CONSTRAINT [PK_Category] PRIMARY KEY CLUSTERED
(
[CatId] ASC
) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[MeasurementData](
[CatId] [int] NOT NULL,
[DateCollected] [datetime] NOT NULL,
[foo] [int] NOT NULL,
[bar] [float] NOT NULL,
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[MeasurementData] WITH CHECK ADD CONSTRAINT [FK_ MeasurementData _Category] FOREIGN KEY([CatId])
REFERENCES [dbo].[Category] ([CatId])
GO
ALTER TABLE [dbo].[MeasurementData] CHECK CONSTRAINT [FK_ MeasurementData _Category]
GO
To load the data into the system, I was thinking of using BCP to load the CSV into a flat table (into SQL Server 2008) and then project the flat table to the hierarchical table structure.
Q1: Should I attempt this projection using T-SQL or C# (C# app outside of SQL Server)?
Q2: Anyone have an existing algorithm to quickly find (or create and return) the correct leaf given the category identifier above?
FYI, I’m also in the process of wrapping my head around the recursive query syntax using the WITH keyword followed by a common table expression - for when i need to do some recursive programming.
https://stackoverflow.com/questions/tagged/common-table-expression
http://media.pragprog.com/titles/bksqla/trees.pdf
Thanks in advance