-1

Given a table named RECORD in mysql with following structure:

rid(pk & AI)   patientid(fk) recordTYPE(varchar)  recordValue(varchar) recordTimestamp(timestamp)
     1             1             temperature(℃)           37.2              2015-08-11 18:10:04
     2             1             weight(kg)                65.0              2015-08-11 18:20:08                    
     3             1             heartbeat(bpm)            66                2015-08-11 18:30:08
     4             1             temperature(℃)           36.8              2015-08-11 18:32:08

You can see that for the same date, there can be multiple records for one particular type of record. e.g. temperature in the sample data :

   rid        patientid          recordTYPE             value              recordtimestamp
    1             1             temperature(℃)           37.2              2015-08-11 18:10:04
    4             1             temperature(℃)           36.8              2015-08-11 18:32:08

In this case, we should choose the latest record. i.e. the record with rid = 4 and value = 36.8 .

Now given an input date e.g. '2015-8-11', I want to do a query to obtain something like:

   date      patientid  temperature(℃)    weight(kg)    heartbeat(bpm)
2015-08-11       1         36.8              65.0            66
2015-08-11       2         36.5              80.3            70
2015-08-11       3         35.5              90.5            80
     ..........................................................
     ..........................................................
2015-08-11       4         35.5              null            null
                                 Fig. 2

In addition, you can see that for a particular date, there may not be any records of some types. In this case, the value in that column is null.

I tried the following query:

   SELECT max(recordTimestamp), patientid, recordTYPE, recordValue  
   FROM RECORD 
   WHERE date(recordTimestamp) = '2015-08-11'
   GROUP BY patientid, recordTYPE

The result is something like:

   date      patientid     recordTYPE      recordValue    
2015-08-11       1       temperature(℃)    36.8              
2015-08-11       1         weight(kg)       65.0            
2015-08-11       1        heartbeat(bpm)    66
2015-08-11       2       temperature(℃)    36.5              
2015-08-11       2         weight(kg)       80.3            
2015-08-11       2        heartbeat(bpm)    70
2015-08-11       4       temperature(℃)    35.5              
                                 Fig. 4

The questions are:

  1. Given this table RECORD, what is the proper mysql statement (in terms of performance such as retrieval speed) to produce the desired result set (i.e. Fig.2)?

  2. Will it be better (in terms of facilitating query and scalability such as adding new types of record) if the db design is changed? e.g. Create one table for each type of record instead of putting all types of record in one table.

Any suggestion is appreciated as I'm a db novice...... Thank you.

  • Structure's OK; it really depends on how the table will be used (how will data get in there, what queries will be run against it, will it be linked with data in other tables, etc); so can't say for sure that it's the best solution, but no alarm bells. `recordTYPE` would be best replaced with a foreign key referencing a different table; that way you reduce the size of each row & improve maintainability. – JohnLBevan Aug 12 '15 at 08:20

2 Answers2

0

You can try this:-

SELECT MAX(rid), patientid, recordTYPE, MAX(recordValue), recordTimestamp
FROM YOUR_TABLE
WHERE recordTimestamp = '2015/08/11'
GROUP BY patientid, recordTYPE, recordTimestamp;
Ankit Bajpai
  • 13,128
  • 4
  • 25
  • 40
0

Here's one way to do it. SQL Fiddle Demo

Sadly MySQL doesn't support the row_number() over (partition by ...) syntax which would have simplified this a lot. Instead I've made excessive use of a trick discussed here: https://stackoverflow.com/a/3470355/361842

select `date`
, `patientId`
, max(case when `tRank`=1 then `temperature(℃)` else null end) `temperature(℃)`
, max(case when `wRank`=1 then `weight(kg)` else null end) `weight(kg)`
, max(case when `hRank`=1 then `heartbeat(bpm)` else null end) `heartbeat(bpm)`
from
(
    select case when @p = `patientId` and @d = cast(`recordTimestamp` as date) then @x := 1 else @x := 0 end
    , case when @x = 0 then @t := 0 end
    , case when @x = 0 then @w := 0 end
    , case when @x = 0 then @h := 0 end
    , case `recordType` when 'temperature(℃)' then case @x when 1 then @t := @t + 1 else @t := 1 end else null end as `tRank`    
    , case `recordType` when 'weight(kg)'      then case @x when 1 then @w := @w + 1 else @t := 1 end else null end as `wRank`    
    , case `recordType` when 'heartbeat(bpm)'  then case @x when 1 then @h := @h + 1 else @t := 1 end else null end as `hRank`
    , case `recordType` when 'temperature(℃)' then `recordValue` else null end as `temperature(℃)`    
    , case `recordType` when 'weight(kg)'      then `recordValue` else null end as `weight(kg)`    
    , case `recordType` when 'heartbeat(bpm)'  then `recordValue` else null end as `heartbeat(bpm)`
    , @d := cast(`recordTimestamp` as date) as `date`
    , @p := `patientId` as `patientId`
    from `Record`
    cross join 
    (
      SELECT @t := 0
      , @w := 0
      , @h := 0
      , @p := 0
      , @x := 0
      , @d := cast(null as date)
    ) x
    order by `patientId`, `recordTimestamp` desc
) y
group by `date`, `patientId`
order by `date`, `patientId`

Breakdown

This says that if this is the last temperature of the day for the current grouping's partientId/date combo then return it; otherwise return null. It then takes the max of the matching values (which given all but 1 are null, gives us the one we're after).

, max(case when `tRank`=1 then `temperature(℃)` else null end) 

How tRank = 1 means the last temperature of the day for a patientId/date combo is explained later.

This line says that if this record has the same patientId and date as the previous record then set x to 1; if it's a new combo set it to 0.

select case when @p = `patientId` and @d = cast(`recordTimestamp` as date) then @x := 1 else @x := 0 end

The next lines say that if we have a new patiendIt/date combo, reset the t, w and h markers to say "the next value you receive will be the one we're after".

, case when @x = 0 then @t := 0 end

The next lines split the data by recordType; returning null if this record isn't their record type, or returning a number saying what how many of this type of record we've now seen for the patientId/date combo.

, case `recordType` when 'temperature(℃)' then case @x when 1 then @t := @t + 1 else @t := 1 end else null end as `tRank`    

This is similar to the above; except instead of returning a combo-counter it returns the value of the current record (or null if this is a different record type).

, case `recordType` when 'temperature(℃)' then `recordValue` else null end as `temperature(℃)`    

We then record the current record's date and patientId values, so we can compare them with the next record on the next iteration.

, @d := cast(`recordTimestamp` as date) as `date`
, @p := `patientId` as `patientId`

The cross join and following subquery is just used to initialise our variables.

The (first) order by is used to ensure that comparing current and previous records is enough to tell if we're looking ata different combo (i.e. if all combos are grouped then any change is easy to spot; if the combos keep alternating we'd need to keep track of every combo we'd seen before). recordTimestamp is sorted descending so that the first record we see on a new combo will be the last record that day; the one we're after.

The group by is used to ensure we get 1 result per combo; and the last order by just to make our output ordered.

Community
  • 1
  • 1
JohnLBevan
  • 22,735
  • 13
  • 96
  • 178