Here's my situation. I have a table of URLs, each with an associated crawl date. When my program processes a URL, I want to INSERT a new row with a crawl date. If the URL already exists, I want to update its crawl date to the current datetime instead. With MS SQL or Oracle I'd probably use a MERGE command for this. With MySQL I'd probably use the ON DUPLICATE KEY UPDATE syntax.
I could do multiple queries in my program, which may or may not be thread-safe. I could write a SQL function with IF...ELSE logic. However, for the sake of trying out Postgres features I've never used before, I'm thinking about creating an INSERT rule, something like this:
CREATE RULE Pages_Upsert AS ON INSERT TO Pages
WHERE EXISTS (SELECT 1 FROM Pages P WHERE NEW.Url = P.Url)
DO INSTEAD
UPDATE Pages SET LastCrawled = NOW(), Html = NEW.Html WHERE Url = NEW.Url;
This actually seems to work great. It probably loses some points for code readability, since someone looking at my code for the first time would have to magically know that this rule exists, but I guess that could be solved with good code comments and documentation.
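To make the readability concern concrete, here is roughly what a call site looks like with the rule in place. This is only a sketch: it assumes Pages has the Url, Html and LastCrawled columns referenced by the rule, and the URL and HTML values are placeholders.

```sql
-- Assumed columns: Url, Html, LastCrawled (taken from the rule definition).
-- The literal values are placeholders for illustration only.
INSERT INTO Pages (Url, Html, LastCrawled)
VALUES ('http://www.foo.com', '<html>...</html>', NOW());

-- If a row with this Url already exists, the rule rewrites the statement
-- so that the existing row gets a new LastCrawled and Html.
-- Otherwise the INSERT goes through and creates the row.
```

Nothing in the statement itself hints that it might behave as an UPDATE, which is why the rule would need to be documented somewhere visible.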
Are there any other drawbacks to this idea, or maybe a "your idea sucks, you should do it /this/ way instead" comment? I'm on PG 9.0 if that matters.
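For comparison, the function-based alternative mentioned earlier might look roughly like the sketch below. It follows the UPDATE-then-INSERT retry pattern shown in the PostgreSQL PL/pgSQL documentation. The function name upsert_page and its parameter names are hypothetical, and the sketch assumes a unique constraint or index on Pages.Url, without which the retry logic would not catch concurrent inserts.

```sql
-- Hypothetical helper; assumes a UNIQUE constraint or index on Pages.Url.
CREATE FUNCTION upsert_page(p_url TEXT, p_html TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- Try to update an existing row first.
        UPDATE Pages
           SET LastCrawled = NOW(), Html = p_html
         WHERE Url = p_url;
        IF FOUND THEN
            RETURN;
        END IF;

        -- No row yet, so try to insert one. If another session inserts
        -- the same Url concurrently, the unique constraint raises
        -- unique_violation and we loop back to retry the UPDATE.
        BEGIN
            INSERT INTO Pages (Url, Html, LastCrawled)
            VALUES (p_url, p_html, NOW());
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- do nothing; retry the UPDATE on the next iteration
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;
```

The caller would then run SELECT upsert_page('http://www.foo.com', '<html>...</html>'); instead of a plain INSERT, which at least makes the upsert behaviour explicit at the call site.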
UPDATE: Query plan since someone wanted it :)
Insert (cost=2.79..2.81 rows=1 width=0)
  InitPlan 1 (returns $0)
    -> Seq Scan on pages p (cost=0.00..2.79 rows=1 width=0)
          Filter: ('http://www.foo.com'::text = lower((url)::text))
  -> Result (cost=0.00..0.01 rows=1 width=0)
        One-Time Filter: ($0 IS NOT TRUE)

Update (cost=2.79..5.46 rows=1 width=111)
  InitPlan 1 (returns $0)
    -> Seq Scan on pages p (cost=0.00..2.79 rows=1 width=0)
          Filter: ('http://www.foo.com'::text = lower((url)::text))
  -> Result (cost=0.00..2.67 rows=1 width=111)
        One-Time Filter: $0
        -> Seq Scan on pages (cost=0.00..2.66 rows=1 width=111)
              Filter: ((url)::text = 'http://www.foo.com'::text)