1

I would like to only insert or update a row, if the following SELECT returns a 0 or no rows.

SELECT (value = ? AND status = ? AND connected = ?)
FROM channels, data
WHERE data.channel_id = channels.channel_id AND channels.channel_name = ? AND sample_time < ?
ORDER BY sample_time DESC
LIMIT 1

Basically, it is a data archiver that only writes changes. That is it only writes data for a given sample_time, if the data is not the same as what was written for the previous sample_time. This SELECT gets the data for a given channel for the previous sample_time and compares it to the data that has come along for the current sample_time. So if this returns 0, that is the data is different, it should go ahead and write it. And if no data has been written for this channel yet, then it should return no rows, and the new data should be written. The following is my query for writing the data:

INSERT INTO data (acquire_time, sample_time, channel_id, value, status, connected)
SELECT ?, ?, channels.channel_id, ?, ?, ?
FROM channels
WHERE channel_name = ?
ON DUPLICATE KEY UPDATE acquire_time = ?, value = ?, status = ?, connected = ?

New data for the current sample_time may overwrite previous data for the current sample_time using the ON DUPLICATE KEY UPDATE, just not if it is the same as the data stored for the previous sample_time.

The duplicate key is the combination of the channel_id and sample_time. There is also a unique index on the channel_id and acquire_time.

Thank you for your time.

Patrick
  • 39
  • 1
  • 1
  • 6

1 Answers1

0

After your clarifications below, I believe this will do what you want:

INSERT INTO data (acquire_time, sample_time, channel_id, value, status, connected)
SELECT ?, ?, channels.channel_id, ?, ?, ?
FROM channels
WHERE channel_name = ?
AND NOT EXISTS (
    SELECT 1
    FROM (
        SELECT value, status, connected
        FROM channels, data
        WHERE data.channel_id = channels.channel_id AND channels.channel_name = ? 
        AND sample_time < ?
        ORDER BY sample_time DESC 
        LIMIT 1 
    ) a
    WHERE a.value = ? and a.status = ? and a.connected = ? 
) 
ON DUPLICATE KEY UPDATE acquire_time = ?, value = ?, status = ?, connected = ?;
squawknull
  • 5,131
  • 2
  • 17
  • 27
  • @squawknull: I'm not sure I understand your example. Does the second part only run if the first part updated a row? I'd like to only run the insert if the second part returned a 1. – Patrick Apr 14 '11 at 16:44
  • I didn't follow your question. I thought you were referring to the insert/update query returning a 1. That's doable though. I'll update my answer. – squawknull Apr 14 '11 at 22:20
  • @squawknull: I'm not sure the second option will work. Having the `(AND value = ? AND status = ? AND connected = ?)` in the `WHERE` clause will return true if any row satisfies those constraints and not just if the one row from the `ORDER BY sample_id DESC` does right? – Patrick Apr 14 '11 at 23:08
  • @squawknull: What if I change it from `AND EXISTS` to `AND 1 =`? – Patrick Apr 14 '11 at 23:42
  • `AND 1=` won't work because it's possible it may return multiple rows, or no rows. The EXISTS clause evaluates as true if there is any match, and by adding the conditions to the where clause on that, it only returns rows if your conditions are met. Lastly, adding the order by to the statement inside EXISTS will have no effect, and in fact might even cause an error. It evaluates as true if any rows are returned, and the order of them is not relevant. – squawknull Apr 15 '11 at 01:29
  • @squawknull: I don't know if it changes the problem much, but I made a mistake. I think I want it to only insert or update if the `SELECT` returns 0. Does that change anything much? Does my first comment still hold though? That putting the conditions in the `WHERE` clause will check if any row matches the conditions and not just the one from the `ORDER BY sample_id DESC`? In other words, first the subquery will get rid of any rows that don't match the `value = ? AND status = ? AND connected = ?`, and return any remaining. I first want to order them, and see if the top row has those conditions. – Patrick Apr 15 '11 at 16:29
  • So it's only the top row you care about if multiple rows are returned? What if no rows are returned? – squawknull Apr 15 '11 at 18:06
  • @squawknull: Yes, if no rows are returned it should go ahead and insert or update. Its basically a data archiver that only archives changes. If the previous sampled value, status, connected for a given channel is the same as the one that is about to be written, then I don't want to write it. If there is no previous entry, then I want to go ahead and write the new one. When I say previous entry, I mean the entry for the previous sample_time, not the previous entry for the current sample_time. – Patrick Apr 15 '11 at 18:13
  • Ah okay. I'll edit my answer, but for the sake of people coming in and looking at this in the future, can you clarify your original question with this information? – squawknull Apr 15 '11 at 18:18
  • @squawknull: Yes, that appears to work perfectly, thank you! Is my clarification ok? – Patrick Apr 15 '11 at 18:54