
I'm creating an application whose core function is to track various data over time (blood glucose levels, insulin dosages, food intake, etc.), and I'm trying to decide how best to organize this information in a database.

At its most basic, everything within this particular umbrella is an event, so I thought of having a single Events table with fields for all the properties that might come up. This may be unwieldy, though, because the vast majority of the fields will end up being blank for many of the entries; but I'm not sure if that's actually a problem. The benefit of that approach is that it will be easier to fetch and display all the events. But since many of the events will only have 'timestamp' in common, I question whether they belong in the same table.

I'm not sure that it makes sense to have a table for every kind of event, because taken separately most of the events have only one property other than timestamp, and they will often have to co-mingle (many types of data will often, but not always, come in a group).

Some types of events have durations. Some are comparatively very rare. One class of events is normally a rate that stays the same unless the rate is altered permanently or with a temporary override (those are the ones I'm most worried about). Some are simple binary tags (for which I was planning on having a linking table, but to make that easy I would need/prefer an overall event_id to link them with).

My inclination is that it is best to have a few tables with closely related types of information rather than one table with everything and a whole lot of empty space, but I am not quite sure how to proceed.

I would love some advice on strategy for determining the best approach in a situation like this.

edit: Here is a rundown of the data types I'm dealing with, in case it makes things clearer.

events:
-blood glucose 
     timestamp
     value 
     (tagged w/: from pump, manually entered)
     [pre-meal, post-meal (breakfast, lunch, dinner), before bed, fasting, hypo, high, hyper - which will be either manually entered or inferred based on settings or other user entries]; before/after exercise etc. I imagine these would be better off dynamically generated with queries as necessary, though the same paradigm could apply to the meals?

-sensor glucose (must be separate because it is not as reliable, so it will be a different number from a regular bg test; also unlikely to be used by the majority of users.)
     timestamp
     amount

-bolus 
     (timestamp)
     bolus total
     food total
     correction total 
     active insulin**
     bolus type - normal [vast majority], square wave, or dual wave

-food
     (timestamp)
     carb amount
     carb type (by weight or exchanges) <- this could probably be in a user settings table
     food-description
     carb-estimated (binary) 
     meal? - or separate table.
     (accompanying bolus id? though that seems too finicky)

-meals
     timestamp
     mealname (breakfast, lunch, supper) (or mealnames table? seems excessive?)

-basal
     timestamp
     rate per hour
     rate changes throughout the day on a regular pattern, so either automatically fill in from the 'last activated pattern' (in the form midnight: 0.7/hr, 7am: 0.9/hr, 12pm: 0.8/hr, etc.)
     or create a new pattern whenever one is used

-temp basal
     (regular basal pattern can be overridden with temporary basal)
     temp basal start
     ?temp basal end and/or temp basal duration
     temp basal amount
     temp basal type -> either in % or specific rate.

-exercise
     start-time
     end-time
     intensity
     ?description (unless 'notes' is universal for any event)

-pump rewind (every 3 days or so)
     -time

-pump prime
     -amount
     -type (fixed or manual)

-pump suspended
     start-time
     end-time

-ketones
     time
     result

-starred
     event

-flagged
     event

-notes
     timestamp
     (user can place a note with any event to provide details or comments, but might want a note where there is no data as well.)

(I want a way for users to flag specific events to indicate they are the result of error or otherwise suspect, and to star events as noteworthy, either to discuss with a doctor or to look at later)

**The only place I get active insulin from is when a bolus is entered, but it could be useful at other times as a constantly tracked variable, which could be calculated by looking at boluses delivered up to X time ago, where X is the Active Insulin Time. For illustration, roughly the kind of windowing query I have in mind (SQLite syntax; table and column names are placeholders, not a settled schema):
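    -- Sum of bolus insulin delivered within the Active Insulin Time window
    -- (here 4 hours), as of a given moment. This only demonstrates the
    -- windowing; real active insulin decays over time, so an actual
    -- calculation would weight each bolus by the time elapsed.
    SELECT SUM(bolus_total) AS active_insulin
    FROM   bolus
    WHERE  timestamp >  datetime('2011-01-26 12:00:00', '-4 hours')
      AND  timestamp <= '2011-01-26 12:00:00';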

other infrequent events (likely 2-10 per year):
-HbA1C 
     time
     value
-weight
     time
     value
     units
-cholesterol
     time
     value
-blood pressure
     time
     value

-pump settings (will need to track settings changes, but should be able to do that with queries)
     -timestamp
     -bg-target
     -active insulin time
     -carb ratios (changes throughout day like basal)
     -sensitivity

Concerns: 1) An overarching 'events' table with a type, to quickly bring back all events in a period of time without having to query every single table? (The disadvantage is: how do I work with events with durations? Have an optional end-time on the event table?)

2) This is a local database which will generally have one user, and there will never be a need to compare or interact with any of the records of other users if it is synced online, so I was thinking of just keeping one version of the database per user, though maybe adding a 'user' id as it is uploaded.

3) Many of the events will often go together for ease of interpretation and analysis (blood sugar, meal, food, bolus, notes, for instance); I gather it's better to do that after the fact with queries rather than hardcoding anything, to maintain integrity.

Some info on what the database will be used for:

- A visual representation of all data types over the course of a day.
- Averages of all test results, and the percentage of insulin which is used for food, correction, and basal.
- Specific advanced queries, like: list up to 20 examples of the difference in glucose level between before-bed glucose and morning glucose when no food was eaten and there was no exercise within 2 hours of bed, since settings were last changed, etc.
- The program will automatically assign tags based on parameters. For example, if >20 carbs are eaten during the assigned 'lunch' period, it will say that food is lunch; if there are two food intakes within 30 minutes (or the 'meal length' preference), it will group them as one meal. Not totally sure how that will function right now.
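To make the second of those concrete, a rough sketch of the kind of aggregate I mean (placeholder names again):

    -- Percentage of bolus insulin used for food vs correction over a month.
    -- The basal percentage would come from the basal tables the same way.
    SELECT SUM(food_total)       * 100.0 / SUM(bolus_total) AS pct_food,
           SUM(correction_total) * 100.0 / SUM(bolus_total) AS pct_correction
    FROM   bolus
    WHERE  timestamp >= '2011-01-01' AND timestamp < '2011-02-01';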

Damon
  • Isn't the event or reading about a patient ? Don't you have PatientId ? – PerformanceDBA Jan 26 '11 at 21:29
  • For which type of server are you creating this database? There might be vendor specific functions which could help or do you want to use "pure" SQL? If there is such a thing! :) – Tony Jan 31 '11 at 11:55
  • It's going to be in SQLite initially, but synchronizing at some point in the future with a remote server.. probably MySQL, but I want it to be flexible in case – Damon Jan 31 '11 at 17:58
  • One thing you haven't specified is by what you are measuring "best". Performance? Maintainability? Ease of use/understanding? While I generally agree with most of what PerformanceDBA has written, a DB with 500 tables is actually far from maintainable by virtue of it being far from easy to understand. Once you do understand it, it is indeed very adaptable and open to change. Having the capability/scope of change does not equate to ease of change. A general approach is to start as normalised as possible and then denormalise as needed. – MatBailie Feb 02 '11 at 15:48
  • That's a good point, though I am no DB expert. I need a database that makes it easy for me to code advanced queries that analyse the relations and averages of the different recorded information, and to insert imported data into it without too much difficulty. Which is maybe still profoundly vague, but I don't know what info would best clarify that. – Damon Feb 02 '11 at 23:13
  • @Dems. Commercial databases are frequently well over 500 tables; my current RDb is 700+ tables. Use a Prefix in the table name to identify the SubjectArea. Data Models (not DDL) make it easy to understand. Yes, ease of change is dependent on standards, of which Normalisation is a part. – PerformanceDBA Feb 03 '11 at 02:48
  • @Damon. If you want advanced queries, analysis, comparisons, and unrestricted expansion, that pretty much determines a Normalised database, not a de-normalised filing system. – PerformanceDBA Feb 03 '11 at 02:51
  • @Damon. 1) Feedback please, so that I can progress the model. 2) You have to award the bounty *before* the time runs out, otherwise you lose it, and no one gets it (SO has funny rules about un-awarded bounties). – PerformanceDBA Feb 05 '11 at 09:16

4 Answers


V1.0

Relational Databases, and SQL (which was designed for them), perform much better when the data is organised and Normalised. The one big table is un-normalised, and crippled in terms of both performance and Relational power.

Your requirement calls for an ordinary Supertype-Subtype cluster of tables. Unfortunately ordinary Relational structures such as this are not "common".

  • The Standard Subtype symbol is the semicircle.

    • The cardinality of the Supertype::Subtype is always 1::0-to-1.

    • The Subtype Primary Key is the Supertype Primary Key. It is also the Foreign Key to the Supertype.

  • There are two types:

    • Exclusive, where there is only one Subtype for each Supertype row, denoted with an X through the semicircle.

    • Non-exclusive, where there is more than one Subtype per Supertype row.

  • Yours is Exclusive. This type needs a Discriminator, to identify which Subtype is active for the Supertype row. Where the number of Subtypes is small, Indicators can be used; otherwise a classification table is required.

  • Note that all this (the structures, the rules, and the constraints required to support it and to provide Data Integrity) is available in ordinary IEC/ISO/ANSI SQL. (The Non-SQLs do not comply with the SQL requirement.)
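As an indicative sketch only (the Data Model linked below is the deliverable; names and DataTypes here are assumptions for illustration), such a cluster renders in ordinary SQL as:

    -- Supertype: one row per Reading; the Discriminator identifies the Subtype.
    CREATE TABLE Reading (
        PatientId    INTEGER      NOT NULL,
        ReadingDtm   TIMESTAMP    NOT NULL,
        ReadingType  CHAR(1)      NOT NULL,  -- Discriminator (illustrative codes)
        CONSTRAINT PK_Reading PRIMARY KEY (PatientId, ReadingDtm),
        CONSTRAINT CK_ReadingType CHECK (ReadingType IN ('G', 'I', 'K'))
    );

    -- Subtype: its PK is the Supertype PK, which is also the FK to the
    -- Supertype (cardinality 1::0-to-1). No additional ReadingId is needed.
    CREATE TABLE ReadingGlucose (
        PatientId    INTEGER      NOT NULL,
        ReadingDtm   TIMESTAMP    NOT NULL,
        Value        DECIMAL(5,1) NOT NULL,
        CONSTRAINT PK_ReadingGlucose PRIMARY KEY (PatientId, ReadingDtm),
        CONSTRAINT FK_ReadingGlucose_Reading
            FOREIGN KEY (PatientId, ReadingDtm)
            REFERENCES Reading (PatientId, ReadingDtm)
    );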

Data

  1. Naming is very important. We are advised to name the table by the row, not the content or meaning or action. You speak of Events, but I can only see Readings.

  2. There must be a context for these Readings or Events. I do not see how an EventId hangs in the air. I have assumed that the Readings are about a particular Patient. Please advise, and I will change the model.

  3. Composite or Compound Keys are normal. SQL is quite capable (the Non-SQLs aren't). PatientId already exists as an FK in Reading, and it is used to form its PK. There is no need for an additional ReadingId column and the additional index, which would be 100% redundant.

  4. SQL is also quite capable of handling many tables (the database I am working on currently exceeds 500 tables), and large numbers of smaller tables are the nature of Relational Databases.

  5. This is pure Fifth Normal Form (no columns duplicated; no Update Anomalies).

    • This can be further Normalised to Sixth Normal Form, and thus further benefits can be gained; and the 6NF can be optimised, etc.; but all that is not required here.

    • Some tables happen to be in 6NF, but that is a consequence, not an intent, so it cannot be declared as such.

  6. If you provide information about the limits and overrides that concern you, I can provide a model that resolves those issues.

  7. Since the data is modelled, it is already set up for very fast comparisons (generating alarms, etc.).

▶Reading Data Model◀

Readers who are unfamiliar with the Standard for Modelling Relational Databases may find ▶IDEF1X Notation◀ useful.

Feel free to ask clarifying questions, either as comments, or as Edits to your question.

Caveat

  1. The OO and ORM crowd (led by Fowler and Ambler) are clueless about Relational technology and Databases. Designing Objects is quite different to modelling data. If you apply their Object design to databases, you will end up with monstrosities that need "re-factoring", and you will have to buy yet another "book" that shows you how to do that efficiently. In the meantime the "database" is crippled.

  2. Relational Databases that are modelled correctly (as data, not objects) never need "re-factoring". In highly Normalised Databases, you can add tables, columns and functions without having to change existing data or code.

  3. Even the concept of ORM is totally flawed. Data has more permanence than Objects. If you model the data first, then model your Objects for the data, it is very stable. But if you model your Objects first (which is weird anyway, without an understanding of the data), then model the data after the Objects, you will be going back and forth, constantly correcting both.

  4. Relational Databases have had perfectly ordinary structures such as Supertype-Subtype for over 30 years, and they work well, if they are implemented as that. They are not "gen-spec" or "class-inheritance" or any such OO thing; and if those OO or ORM structures are implemented without modelling the data correctly, the "database" will be crippled, and need "re-factoring".

    • Additionally, they do not implement the required Data Integrity constraints, so usually data quality is poor. We do not allow bad data to enter the Database; their "databases" are full of bad data, and they need another "book" on how to wash dirty data.
  5. They have the sequence, and the hierarchy, mixed up. Done correctly, there is no "impedance mismatch", no pseudo-technical names to mask pure stupidity; to justify doing the same set of work over and over and over again.

So run like hell from anyone using OO or ORM terminology when dealing with Relational Databases.

V1.1

Your Edit provides far more detail, which of course is demanded, because the context, the whole, is necessary if the data is to be modelled correctly. This incorporates all that info. However, questions remain, and some back-and-forth will be required before it can be complete. Feel free to ask questions about anything that is not absolutely clear; I am not sure exactly where the gap is until I throw something up and you speak to it.

▶Event Data Model V1.1◀

  1. All my models are pure Relational (retain full Relational power), IDEF1X compliant and Fifth Normal Form (no Update Anomalies). All Rules (business or data/referential Integrity) that are drawn in the model can be implemented as Declaration in ISO/IEC/ANSI SQL.

  2. Never hard-code anything. My models do not require that, and any code working with the database does not have to do that. All fixed text is Normalised into Reference or Look-up tables. (That bit is incomplete; you need to fill in the gaps.)

    • A short alphabetic code is far better than an Enumeration; once you get used to it, the values and meanings become immediately recognisable.

    • Because they are PKs, and therefore stable, you can safely code:

      ... WHERE EventTypeCode = 'P'
      or
      ... WHERE EventTypeCode LIKE 'T%'

  3. I believe the DataTypes are self-evident or can be worked out easily. If not, please ask.

  4. Everything that you note as "finicky" is perfectly valid. The issue is, since you have not had a database to engage with, you did not know what should be in the database vs what should be, or can be, SQL code. Therefore all the "finicky" items have been provided for (the database elements); you need to construct the code. Again, if there is a gap, please ask.

    • What I am saying is, working in the traditional style of I am the Data Modeller, you are the Developer, you have to ensure every item from your perspective is delivered, rather than relying on me to interpret your notes. I will be delivering a database that supports all the requirements that I can glean from your notes.
  5. One Patient per Database. Let's allow for the possibility that your system is successful; in the future, you will have one central workhorse database, rather than limiting it to one database per patient, which would be a nightmare to administer. Let's say that you need to keep all your Patient details in one place, one version of the truth. That is what I have provided. This does not limit you, in the short term, from implementing one Db per patient; there is no problem at all with only one row in the Patient table.

    • Alternately, I can strip PatientId out of all the tables, and when you grow into a central database configuration, you will require a major database upgrade.

    • Likewise, if you have Sensors or Pumps that you need to track, please identify their attributes. Any Sensor or Pump attributes would then be Normalised into those tables. If they are "one per patient" that's fine, there will be one row in those tables, unless you need to store the history of Sensors or Pumps.

  6. In V1.0 the Subtypes were Exclusive. Now they are Non-exclusive. This means we are tracking a chronology of Events, without duplication; and any single Event may consist of more than one Subtype. Eg. Notes can be inserted for any Event.

    • Before completion, the EventType list provided needs to be filled out in the form of a grid, showing (a) permitted and (b) mandatory Subtypes per EventType. That will be implemented as CHECK Constraints in Event; an indicative fragment follows.
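      An indicative fragment only (the EventTypeCodes here are placeholders; the real grid comes from you):

          -- The simplest declarative piece: restrict the Discriminator to
          -- known codes. (Standard SQL; SQLite would need the CHECK inline
          -- in the CREATE TABLE.)
          ALTER TABLE Event
              ADD CONSTRAINT CK_EventTypeCode
              CHECK (EventTypeCode IN ('GB', 'GS', 'BO', 'FD', 'ML',
                                       'BA', 'TB', 'EX', 'PS', 'NT'));
          -- The permitted/mandatory Subtype rules per EventType then become
          -- one CHECK per Subtype table, confirming that its Supertype row
          -- carries a matching code.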
  7. Naming is very important. I am using ISO standard 11179 (guidelines and principles) plus my own conventions. Reading type Events are prefixed as such. Feel free to suggest changes.

  8. Units. Traditionally, we use either Metric xor US Imperial across the database, allow entry in whatever the user likes, and convert before storage. If you need a mixture, then at least we should have the UnitType specified at the Patient or Pump level, rather than allowing storage of either UnitType. If you really need either UnitType stored, changing back and forth, then yes, we need to store UnitType with each such Value.

  9. Temporal Database. You have Time Series being recorded, as well as interpreted via SQL. It is a big subject, so read up on it. The minimum I would ask you to read and understand is:

    ▶Temporal Database Performance (0NF vs 5NF)◀

    ▶Classic 5NF Temporal Database◀ (Inspect the Data Model carefully)

  10. Basically the issue boils down to this:

    • Either you have a true 5NF database, no data duplication, no Update Anomalies.

      • That means, for continuous time series, only the StartDateTime is recorded. The EndDateTime is easily derived from the StartDateTime of the next row; it is not stored (a minimal sketch closes this item). Eg. Event is a continuous chronology; the EventType identifies whether the Event is a specific DateTime or a Period/Duration.

      • EndDateTime is stored only for disjoint Periods, where there are legitimate gaps between Periods; in any case it is clearly identified via the EventType. Eg. Exercise, PumpSuspended. (Incidentally, I am suggesting the patient only knows the actual, as opposed to planned, attributes, at the end of the Exercise period.)

      • Since generally there is no EndDateTime, StartDateTime is simply DateTime. Eg. EventDtm

      • This requires the use of ordinary SQL Subqueries. This is actually quite simple once the coder has a grasp of the subject. For those who don't, I have supplied a full tutorial on Subqueries in general, and on using them in a Temporal context in particular, in:

      ▶It Is Easy When You Know How◀. Not coincidentally, re the very same Classic 5NF Temporal Database above.

    • XOR you have a database with an EndDateTime stored (100% duplication) with every StartDateTime column, and you can use flat, slow queries: lots of manipulating large result sets with GROUP BYs, instead of small result sets. Massive data duplication and Update Anomalies have been introduced, reducing the database to a flat file, to supply the needs of coders with limited ability (certainly not "ease of coding").

    • Therefore, consider carefully and choose, for the long term only, because this affects every code segment accessing temporal data. You do not want a re-write halfway down the track when you realise that maintaining Update Anomalies is worse than writing Subqueries.

      • Of course, I will provide the explicit requirements to support a 5NF Temporal Database, correct DataTypes, etc., to support all your identified requirements.

      • Further, if you choose 0NF, I will provide those fields, so that the Data Model is complete for your purpose.

      • In either case, you need to work out exactly the SQL code required for any given query.
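    To make the 5NF option concrete, a minimal sketch (table and column names per the model; illustrative only):

        -- EndDateTime is not stored; derive it as the StartDateTime of the
        -- next row in the chronology, via an ordinary correlated Subquery.
        SELECT e.EventDtm AS StartDateTime,
               (SELECT MIN(e2.EventDtm)
                  FROM  Event e2
                  WHERE e2.PatientId = e.PatientId
                    AND e2.EventDtm  > e.EventDtm
               ) AS EndDateTime
        FROM   Event e;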

  11. DataType handling is important. Do not store Time (hours, etc.) as an Integer or an offset; store it only as a TIME or DATETIME DataType. If it must be an offset, store it as Time since midnight. That will allow unrestricted SQL and Date Arithmetic functions.

  12. Task for you. Go through the model carefully, and ensure that:

    • every non-key Attribute has a 1::1 relationship with its Primary Key

    • and that it does not have a relationship to any other PK (in some other table)

    And of course, check the Model and provide feedback.

Question

Given the above explanations and guidance.

  • What is ReadingBasalTemperature.Type, list values please ?

  • What is HbA1C ?

  • What is Ketone ?

  • Do we need (ie. Duration/Period EndDateTime):

    • ReadingBasalTemperatureEnd
    • ReadingBolusEnd
    • Basal Pattern
    • BasalTemp Pattern
    • Actually, what is a pattern, and how is it derived/compared ?
  • How is BasalTemperatureEnd (or Duration) determined ?

  • The starting position is, there is no need to store Active Insulin Duration. But you need to define how the EndDateTime is determined. Based on that, if it cannot be easily derived, or it is based on too many factors or changes all the time, storing an EndDateTime might be good.

  • The Pump Settings need clarification.

V1.2

Ok, I have incorporated all the information you have provided in the question and the comments. Here is a progressed Data Model.

▶Event Data Model V1.2◀

There are still some issues to be resolved.

  • Use a Percentage or a Rate only, not both with an additional indicator. One can be derived from the other. I am using Rate consistently.

  • ... the only worry about the approach is that for many days the basal rate will be identical.. hence redundancy

    • That is not "redundancy". That is storage of a time series of facts, which happen to be unchanging. The queries required are straight-forward.

    • However, in advanced use, yes, you can avoid storing an unchanged fact, and instead extend the duration to include the new time interval.

  • I am still not clear re your explanation of Basal Temp. Please study the new Model. First, the patterns are now stored separately. Second, we are recording a Basal Temp Start with a Rate. Do we need a Basal Temp End (with a Rate) ?

  • "GlucoseEventType would be able to have more than one value per Glucose Result" needs more definition. Don't worry about ID keys. Just tell me about the data. For each ReadingGlucoseBlood, name the result values, and which GlucoseEventType they apply to; which are mandatory and which are optional.

  • PumpHistory.InsulinEndDateTime is the ending Instant for the Duration. Of course that is generic; the starting Instant is whatever row you compare it to. Thus it should be seconds or minutes since midnight 01 Jan 1900.

  • Check the new Event PK. Where the incoming record identifies several Events, you need to parse that, and INSERT each Event-EventSubtype row, using the same DateTime.

  • Except for Patient, there are no ID keys in this database, none are required thus far. Refer to the parent by full PK.

05 Feb 11

No feedback received re V1.2.

a lot of the data i'm getting is being pulled from an external (and somewhat disorganized) csv which groups certain event types under one row and often has events on the same second, which is as granular as it gets

That is easy to overcome. However, that means that an Instant is not an Instant. Now, I could walk you through the whole exercise, but the bottom line is simple.

  • If you really need it, we could add a SequenceNo to the PK, to make it unique. But I suspect the EventTypeCode is enough (there will not be more than one EventType per second). If not, let me know, and I will change the model.

  • Retain the meaning of an Instant as an Instant, and thus avoid departing from the architectural requirements of Temporal Databases.

  • Use EventType to afford uniqueness to the DateTime Pk.

    • Keep in mind that the EventTypeCode is deployed in the Event PK, not as a Discriminator requirement, but to afford uniqueness. Thus its presence in the PK of the Subtypes is an artefact, not that of a Discriminator (which is already known by virtue of the Subtype).
  • However there is unnecessary complexity due to the Non-exclusive Subtype (there can be more than one Subtype per Supertype row).

  • Therefore I have changed it back to an Exclusive Subtype, deterministic. One EventType per Supertype row; max one Subtype.
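In DDL terms (indicative only; DataTypes and code width are assumptions), the Event PK now reads:

    -- EventTypeCode is in the PK purely to afford uniqueness within the
    -- one-second DateTime resolution; it is an artefact, not a
    -- Discriminator requirement. A SequenceNo could be added instead,
    -- if that ever proves insufficient.
    CREATE TABLE Event (
        PatientId      INTEGER   NOT NULL,
        EventDtm       TIMESTAMP NOT NULL,
        EventTypeCode  CHAR(2)   NOT NULL,
        CONSTRAINT PK_Event PRIMARY KEY (PatientId, EventDtm, EventTypeCode)
    );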

Refer to Implementing Referential Integrity for Subtypes for specific information re Constraints, etc.

The change to the Data Model is too small to warrant another release. I have updated the V1.2 Data Model.

06 Mar 11

Due to upholding the "above all, be technically honest" guideline in the FAQ, and confronting misinformation as requested therein, I was suspended for my efforts (which means I will no longer correct misinformation on SO, and such posters have free rein). Interaction with the seeker was carried on, to completion, and the Final Data Model was completed, away from SO. The progression is therefore lost to SO readers. However, it may be of value to post the result, the ▶Final Data Model V1.16◀.

  • Events always have a starting Instant (Event.DateTime).
  • Events may be Durations, in which case an ending Instant (Event) is required.
  • Some Events consist of only the Supertype; others require a Subtype. This is identified in the third column of the EventType exposition.
  • The fourth column identifies the type of Event:
    • Instant or Duration
    • Duration: Conjunct or Disjunct
  • Note that the resolution of DateTime on the seeker's platform is one second, and many Events may occur in one second, but not more than one of the same EventType. EventTypeCode has therefore been included in the Event Primary Key to implement that rule. Thus it is an artefact; it is not a generic requirement for a Supertype-Subtype structure, or for Exclusive/Non-exclusive Subtypes.
  • Intended for printing on two facing US Letter pages, enlarged or not, with a gusset.
PerformanceDBA
  • I agree entirely with this, and you should make it into a blog post, but I don't think it is actually what he is asking. – Ken Downs Jan 26 '11 at 23:44
  • @Ken. It is probably best to let Damon decide that. The Caveat is to counter the misinformation on SO. – PerformanceDBA Jan 27 '11 at 05:38
  • @PerformanceDBA - I added some clarification about the data types I am dealing with; much of it is being interpreted from data imported from an insulin pump. – Damon Jan 28 '11 at 02:52
  • @Damon. I have answered the original question, and provided a Data Model detailing the structure required. The **edit** is way more information, and changes things quite a bit. Also, it will take some back-and-forth before we get to a complete Data Model, probably 20 tables. Simple questions on SO are worth 10+15 points; this one is very large. I am happy to do that for a bounty, whatever it is worth to you. – PerformanceDBA Jan 29 '11 at 01:32
  • @Damon. Take a look at [this question re the attention I give the task and a finished Data Model](http://stackoverflow.com/questions/4491173/historical-auditable-database/4573129#4573129). You will also need some [help with the SQL to handle Temporal data (Time Series)](http://stackoverflow.com/questions/4682170/sql-to-find-the-first-occurance-of-sets-of-data-in-a-table/4686539#4686539). – PerformanceDBA Jan 29 '11 at 07:06
  • @Damon. Wow, that is generous! Answer expanded. Over to you. – PerformanceDBA Jan 30 '11 at 04:42
  • Sorry I haven't had a chance to go over it yet.. it looks amazing at first glance, though – Damon Jan 31 '11 at 07:06
  • ReadingBasalTemporary.Type is either percentage (of regular basal rate) or units per hour. I think I mentioned that HbA1c is the glycosylated hemoglobin test.. I will have to get back to you on the units, but we can leave it out for now; same for ketones (which is similar to blood sugar but reserved for when bg is particularly high). Basal patterns (and carb ratios) are set over the course of a day, basically like 0:00-6:00: 0.7, 6:00-15:30: 0.5, 15:30-19:55: 0.8, etc. Some pumps allow users to store a few different patterns and switch between them. – Damon Jan 31 '11 at 07:31
  • The reason I went with events over readings is that only certain information is provided by the pump.. a lot of other information (exercise, what food was eaten, notes, what meal it was, etc.) is added by the user or inferred by the program. – Damon Jan 31 '11 at 07:35
  • I like the idea of keeping patient_id in.. no reason not to. I want to eventually offer web sync, but not require an internet connection, so the patient id may differ from local to remote, but that can be easily stored. There will be user preferences, which could include pump model, as well as normal mealtimes (so it can automatically assume lunchtime on a carb event greater than 20 carbs between noon and 2pm, for instance). – Damon Jan 31 '11 at 07:46
  • My main concern right now is that I've just never worked with this number of tables, so it will be fun coming up with the queries. – Damon Jan 31 '11 at 07:49
  • Still wrapping my brain around the duration stuff; I may need to sleep on it :O. I think for basal it might be OK to wrap it all into one table actually, since the total rate is what matters, and have an isTemp to denote if it is temporary (which can be determined from the imported data), because users will want to be able to see when their temporary basals started etc., but the basal rate is really what matters. The only worry about the approach is that for many days the basal rate will be identical.. hence redundancy – Damon Jan 31 '11 at 08:05
  • Other comment about duration: the vast majority of boluses are instant (for all intents and purposes), but special bolus types (square and dual wave) do have a duration. – Damon Jan 31 '11 at 08:06
  • I like the idea of universal metric datatypes and recalculating on view if necessary. – Damon Jan 31 '11 at 08:07
  • The initial plan is to only support one particular pump; the sensor I have in there is theoretical, for the future. Other pumps would potentially want to be supported as well. Pump history would be a good idea actually; though it's very uncommon for people to switch pumps, it does happen. – Damon Jan 31 '11 at 08:10
  • Pump settings are things like the insulin-to-carb ratios, basal rates, correction factor, and active insulin time. They are used by the pump to calculate bolus amounts, correction boluses, etc., so their derived data is already stored, but settings ought to be tracked so that a user can see what the settings were at any point in time. I think that addresses the questions you had.. if there are others, do let me know. – Damon Feb 01 '11 at 23:36
  • GlucoseEventType would be able to have more than one value per Glucose Result.. so I'd include a GlucoseEventType table with ReadingGlucoseBloodId as a foreign key. Unless that seems foolish! – Damon Feb 02 '11 at 03:57
  • Also, I've been looking at the data, and much of the stuff I will be getting from the pump import has identical timestamps (it records certain data as one record), so patient_id/eventdtm will not be a sufficient key. So if each event has to have its own id, then I gather it's not useful to include the patient_id and eventdtm in the event-type child categories, and just use the parent's ID? – Damon Feb 02 '11 at 05:35
  • I realized I hadn't discussed much what the data will be used for, so I elaborated on that a bit at the end of the question. – Damon Feb 03 '11 at 00:13
  • @Damon. V1.2 Data Model posted, with outstanding issues. Over to you. – PerformanceDBA Feb 03 '11 at 03:29
  • @Damon. Added a **05 Feb 11** section. Minor update to the Data Model. Waiting for feedback, to ensure the answer is progressed and complete. – PerformanceDBA Feb 05 '11 at 12:29
  • Sorry for the delay. Re: temporary basals (the long name for temp basal), they would require an endtime. They need to be represented separately from just a simple basal rate, practically, because a patient will be concerned with examining temporary basals separately (they're used around exercise or illness or various other events that would affect blood sugar). – Damon Feb 05 '11 at 23:14
  • Basically, the information that I get from the pump output (at least for the brand I'm initially supporting) is a rundown of basal patterns, an event when a new pattern is activated, and an event for each temporary basal including rate (in units per hour or percentage of regular basal, which complicates things because the rate is not necessarily fixed for the length of the temporary basal if it runs past a regular basal pattern change). How I incorporate that into the data model is one of the more complicated aspects of this. – Damon Feb 05 '11 at 23:17
  • What I meant by "GlucoseEventType would be able to have more than one value per Glucose Result" is that multiple types can apply to a single result, which would, as I understand things, require a table for event types rather than just a column on the glucose event table. – Damon Feb 05 '11 at 23:23
  • But yes, exercise and temp basal should have both start and end times! If you think storing the patterns in a separate table, rather than inserting the changes in rate directly into the database, is best, we can go with that. I'm not sure. I like that method because it gives an easy way to display current basal patterns. – Damon Feb 05 '11 at 23:32
  • @PerformanceDBA I got the bounty for you, since you've definitely gone above and beyond in helping me with this. Thanks! If there are some questions I still haven't addressed, let me know. – Damon Feb 06 '11 at 00:01
  • @Damon. Thank you very much, and it is my pleasure. It is Sunday here, I will get back to it shortly, and take you to completion. – PerformanceDBA Feb 06 '11 at 06:11
  • @Ken. Now we can have that discussion. – PerformanceDBA Feb 06 '11 at 09:51
  • @PerformanceDBA I'm still not entirely understanding the specific practical benefit of having a separate table for the endtime of an event, rather than just including the endtime as a property of the event info, when it's a sporadic event rather than a continuing pattern. – Damon Feb 11 '11 at 20:58

No, the second you say something like "the vast majority of the fields will end up being blank for many of the entries", your schema is almost certainly broken.

Blood glucose should be in its own table with nothing more than a date/time and level. Food intake similarly, if you're just storing the carb content. If you're storing the individual components of the meal, you should have a one-to-many relationship between the food-intake table and the food-components table.

Same for the insulin, with a timestamp and quantity.

The right way to represent the relationships between disparate objects is with, well, relationships: foreign keys and such.
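For example, a rough sketch (table and column names made up on the spot, not a prescription):

    -- Each reading stands alone: just when and what.
    CREATE TABLE blood_glucose (
        taken_at TIMESTAMP    NOT NULL PRIMARY KEY,
        level    DECIMAL(5,1) NOT NULL
    );

    -- One intake, many components: the classic one-to-many via a foreign key.
    CREATE TABLE food_intake (
        intake_id INTEGER PRIMARY KEY,
        eaten_at  TIMESTAMP NOT NULL
    );

    CREATE TABLE food_component (
        intake_id   INTEGER NOT NULL REFERENCES food_intake (intake_id),
        description TEXT    NOT NULL,
        carbs_g     INTEGER NOT NULL
    );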


And, as a "nothing-to-do-with-the-programming-side-of-it" aside (on the co-mingling of events): if you're taking your insulin and having your meal at the same time, you're probably doing it wrong. Most fast-acting insulin takes half an hour to start doing a proper job, and the food entering your gut will start the production of sugars in about five to ten minutes. So there should be about a half-hour delay between shooting up and eating.

The only usual "same-timestamp-or-thereabouts" events that I'm aware of would be the blood sugar level immediately before the meal to ensure the insulin you injected half an hour ago has started working.

But I'm not your endocrinologist and I don't even play one on TV :-) So check with your medical bods first.


paxdiablo
  • Thanks for the response. I was not particularly sure if having empty fields was actually a disadvantage. Re: comingling.. you have a good point. Though my team has never recommended I take insulin that way, I have seen it recommended (it's logical, but the drawback being that things can change in half an hour.. not just how much you're eating, but what if dinner is accidentally destroyed, etc.). The takeaway, though, is that everyone has different habits and I should be able to accommodate that. – Damon Jan 26 '11 at 21:03

"My inclination is that is best to have a few tables with closely related types of information rather than one table with everything and a whole lot of space.. but am not quite sure how to proceed."

Simple and best. To see why, we can examine the alternatives.

One table per metric: fully normalized, but incredibly fatiguing. I'm normally a normalization hawk, and I normally say 'tables are free', but this is not quite true when they start to number in the dozens or hundreds for like things. So we'd prefer something simpler.

At the other end we have E-A-V: one table, with values like test_id, metric_id, value. This is well known to be nigh-on impossible to query and work with. Like the Venus flytrap, it draws you in with sweet-smelling nectar, then closes in on you and eats you.

On the gripping hand, there is the one large table with all possible columns. This is called the 'sparse' solution, for obvious reasons. I did this in Direct Marketing and it works well, but that was a highly specialized situation; this approach is not generally recommended.

So somewhere in the middle is a handful of tables, one table for each set of tests with similar values to store.

Hmmmm, which is just what you suggested. Sounds good!

Ken Downs
  • 1) Normalisation hawks do not post such advice; you are contradicting yourself. 2) By definition, the fat table is Dense, the Normalised tables are Sparse. – PerformanceDBA Jan 27 '11 at 05:34
  • On what basis is an Entity-Attribute-Value table "nigh-on impossible to query and work with"? If you re-organise your description to A-E-V, you effectively have all of your metric tables stacked one on top of the other. If you can work with many tables, you can work with this. – MatBailie Feb 02 '11 at 15:32

Take a look at these SO examples: one, two, three, four.

Damir Sudarevic