1

There are two database tables described in the following. The asterisk highlights the primary keys.

+----------------+    +----------------+
| posts          |    | url_references |
+----------------+    +----------------+
| id*            |    | url*           |
| post_content   |    | post_id        |
+----------------+    +----------------+

I want to insert or update a post based on the existence of the corresponding entry in the table url_references. What is the best combination of SQL commands to reach this? I would like to avoid to process the decision about insert or update in PHP.
The following scenarios describe the alternative step by step behavior using PHP commands.

SELECT * FROM url_references WHERE url = $url;

Scenario 1: Insert new entry.

// mysql_num_rows() returns 0
INSERT INTO post (post_content) VALUES ($postContent);
$postId = mysql_insert_id();
INSERT INTO url_references (url, post_id) VALUES ($url, $postId);

Scenario 2: Update existing entry.

// mysql_num_rows() returns 1
$row = mysql_fetch_array($rows);
$postId = $row['post_id'];
UPDATE posts SET post_content = $postContent WHERE id = post_id;

Edit 1: Note, that I cannot check for the id in posts directly! I want to manage (insert/update) posts based on their url as the primary key.

JJD
  • 50,076
  • 60
  • 203
  • 339
  • 2
    [How do I update if exists, insert if not (aka upsert or merge) in MySQL?](http://stackoverflow.com/questions/1218905/how-do-i-update-if-exists-insert-if-not-aka-upsert-or-merge-in-mysql) – dee-see Aug 21 '11 at 11:19
  • Yes, I read about it. But I have to deal with 2 tables. – JJD Aug 21 '11 at 11:25
  • @Vache: Good shot! I would add that without using MySQL-specific extensions one need to use [`SELECT ... FOR UPDATE`](http://dev.mysql.com/doc/refman/5.5/en/innodb-locking-reads.html) syntax, then to `UPDATE` or `INSERT` and then `COMMIT`. That would be a standard way of solving the task like this (`autocommit=off` of course). – dma_k Aug 21 '11 at 11:34
  • @dma_k: Could you explain to me how you would use `SELECT ... FOR UPDATE` with my example? It sounds promising though I do not quite understand how to use it. – JJD Aug 21 '11 at 21:03
  • 1
    In some scenarios it might be a good option to move some of the logic to a [stored procedure](http://dev.mysql.com/doc/refman/5.1/en/stored-routines.html). – tuomassalo Sep 23 '11 at 08:53
  • In scenario 2, do you mean `$postId = $row['post_id'];`? – TRiG Oct 01 '14 at 10:35

2 Answers2

1

If you would like to do above in a classical way, do the following.

First we may agree that url is a natural primary key. Anyway you need an index in this column to speedup your lookups:

CREATE UNIQUE INDEX url_references_idx ON url_references(url);

After that if you execute:

INSERT INTO url_references (url) VALUES ($url);

you end up with two scenarios:

• The INSERT succeeds. That means your $url is new and you can proceed with:

INSERT INTO posts (id, post_content) values (NULL, $postContent);
SELECT LAST_INSERT_ID(); // This will return $post_id
UPDATE url_references SET post_id = $post_id WHERE url = $url;
COMMIT;

In this scenario the lock on newly inserted row in url_references guarantees that another thread will go the 2nd scenario, if 1st transaction is successfully committed (or 1st scenario if it fails).

• The INSERT fails. That means your $url is already known and you can proceed with:

SELECT post_id FROM url_references WHERE url = $url;
UPDATE posts SET post_content = $postContent WHERE id = $post_id;
COMMIT;

Note: The first INSERT statement guarantees that racing condition on url_references table is correctly handled provided you have enabled a correct transaction isolation level and autocommit=off.

Note: Using SELECT ... FOR UPDATE does not work in this case, as it will lock only existing rows (and we need to lock non-existing row, which is about to be inserted). Sorry if I confused you, please ignore my comment under your question.

dma_k
  • 10,431
  • 16
  • 76
  • 128
  • Is there a way to pipe the result of `SELECT LAST_INSERT_ID();` into the `UPDATE`query? I would like to avoid to process the `$post_id` via PHP since I do not really need it there. – JJD Aug 24 '11 at 15:27
  • Just try it and edit my post :) I will be happy to incorporate your improvements. I have also found a slight but critical error... – dma_k Aug 25 '11 at 10:15
  • Thank you for edits. Still my question is: Can I avoid processing the result set of the `SELECT` in PHP? – JJD Aug 25 '11 at 14:05
  • One thing appears wrong to me: (1) you insert into table `url_references`, (2) you insert into table `posts`, (3) you ask for the `LAST_INSERT_ID`. --- The order should be 1-3-2 since the table `url_references` wants the `post_id`. Please correct me if I am wrong. – JJD Aug 25 '11 at 20:55
  • You have stated that `url_references` does not have and id :) If you need it somewhere else in your application, that would be another `SELECT LAST_INSERT_ID()`, so 1-3-2-3. But for above example it was not needed. – dma_k Aug 26 '11 at 09:57
0

You can use REPLACE INTO to avoid deciding between INSERT/UPDATE on the posts table, then based on the number of rows affected by REPLACE determine whether you need to INSERT into url_references. Something like this:

$sql = "REPLACE INTO posts (post_id, post_content) VALUES ($postId, $postContent)";
$result = mysql_query($sql);
$numRows = mysql_num_rows($result);

if ($numRows == 1) {
    // was INSERT not DELETE/INSERT (UPDATE) - need to insert into url_references

    $postId = mysql_insert_id();
    $sql = "INSERT INTO url_references (url, post_id) VALUES ($url, $postId)";

    ... do SQL INSERT etc
}

From the MySQL REPLACE documentation:

The REPLACE statement returns a count to indicate the number of rows affected. This is the sum of the rows deleted and inserted. If the count is 1 for a single-row REPLACE, a row was inserted and no rows were deleted. If the count is greater than 1, one or more old rows were deleted before the new row was inserted.

Note that I haven't tested the above pseudo-code, so you may have to tweak it.

See: http://dev.mysql.com/doc/refman/5.0/en/replace.html for reference.

JJ.
  • 5,425
  • 3
  • 26
  • 31