6

The following query returns 2036 rows:

SELECT "FooUID" from "Foo" f
LEFT JOIN "Bar" b ON f."BarUID" = b."BarUID"
WHERE f."BarUID" IS NOT NULL AND b."BarUID" IS NULL

But the following statement only updated 1870 rows:

UPDATE "Foo" f1 set "BarUID" = 'aNewUID'
WHERE f1."FooUID" IN (
   SELECT f2."FooUID" from "Foo" f2
   LEFT JOIN "Bar" b ON f2."BarUID" = b."BarUID"
   WHERE f2."BarUID" IS NOT NULL AND b."BarUID" IS NULL
)

How is this possible?

EDIT 1: The first query continues to return 166 rows, and the second continues to update 0 rows.

EDIT 2:

In the following, the nested query returns a row containing a UID, but the outer query returns 0 rows.

SELECT * from "Foo" f1
WHERE f1."FooUID" = (
   SELECT f2."FooUID" FROM "Foo" f2
   LEFT JOIN "Bar" b ON f2."BarUID" = b."BarUID"
   WHERE f2."BarUID" IS NOT NULL AND b."BarUID" IS NULL
   LIMIT 1
)

Am I crazy?

EDIT 3:

The following statement, provided by @wildplasser succeeded in updating the remaining 166 rows:

UPDATE "Foo" ff
SET "BarUID" = 'aNewUID'
WHERE ff."BarUID" IS NOT NULL
AND NOT EXISTS (
   SELECT * FROM "Bar" bb
   WHERE bb."BarUID"= ff."BarUID"
)

However, I still don't understand why the original didn't pick them up. If the nested query selected 166 "FooUID"s, why would they not be matched to rows in the "Foo" table using IN?

EDIT 4: The more I think about it, this background might be important:

This all took place on a database server that was recently cloned from another one. I spoke to the IT guy who did the cloning, and it turns out he didn't shut down an application running on top of the original DB before bringing it down to clone it. This means the DB was mostly likely brought down mid-transaction (I don't know how ungracefully). Is it possible something in the database was left in a corrupted state, leading me to see these phantom rows?

Unfortunately I can no longer repro it, since running wildplasser's fix. The original DB (up and serving the application again) has none of the invalid data I was trying to fix on the copy, much less any trace of the shenanigans I witnessed.

I should mention that before running the fix, I reduced the issue to the most basic absurdity: I first selected the FooUID from the nested query in Edit 2, copied it to the clipboard, then ran a query selecting from Foo where FooUID equaled the pasted value - this still returned 0 rows.

Community
  • 1
  • 1
Paul Bellora
  • 54,340
  • 18
  • 130
  • 181
  • What type is `FooUID` and which one is returned by the inner query in EDIT 2? No idea what's going on yet, just poking around a bit. – mu is too short Sep 11 '11 at 21:56
  • @mu is too short - `FooUID` is type `uuid`. Not sure I understand the second part of your question though. – Paul Bellora Sep 11 '11 at 22:03
  • Your EDIT 2 has me stumped. Must be a value that doesn't compare equal to itself. You're _sure_ the inner part is returning a non-NULL `FooUID`? – Scott Lamb Sep 13 '11 at 21:41
  • @Scott - Absolutely. See my last paragraph - ultimately it wasn't even about nested queries so much as pure weirdness. At the time of running wildplasser's fix I was relieved to get rid of the phantom rows, but now I'm wishing I'd preserved them as proof. I'm going to try reproing it when I get the time, but it's a longshot. – Paul Bellora Sep 13 '11 at 21:55
  • Hmm. You had a couple sessions going, then; is it possible that your "EDIT 2" update and that last select were in a session where a transaction had been going since before wildplasser's update while the session where you did just the inner update had a transaction started afterward? – Scott Lamb Sep 13 '11 at 22:06
  • @Scott - That's a good thought, but I don't think so. This was all through one pgAdmin query window, so I assume the same session. The EDIT 2 and last paragraph of EDIT 4 queries were run before the fix in EDIT 3. After the fix everything appears normal. Sorry if the question is confusing with so many edits. – Paul Bellora Sep 14 '11 at 01:40
  • I'm out of ideas then. If you figure it out, I'd like to know. – Scott Lamb Sep 14 '11 at 01:45
  • @Scott - Thanks for racking your brain. As I said I will try and repro when I get a chance. Take care. – Paul Bellora Sep 14 '11 at 01:59

1 Answers1

2

What happens if you rewrite this with NOT EXIST, like

UPDATE Foo ff
SET baruid = 'aNewUID'
WHERE ff.baruid IS NOT NULL
AND NOT EXISTS (SELECT * FROM bar bb
    WHERE bb.baruid = ff.baruid
    );

Looks much cleaner to me than selecting the limb leg of an outer join.

wildplasser
  • 43,142
  • 8
  • 66
  • 109
  • Also, a foreign key constraint would probably avoid this mess. – wildplasser Sep 11 '11 at 20:28
  • Funny you should mention that, this is me trying to clean up bad data to add foreign keys. – Paul Bellora Sep 11 '11 at 20:34
  • Gheghe. Try adding a correlation name to the subquery. Maybe someone got confused seeing the two foos. You could also check the wuery plans (correlation names come in handy there, too) – wildplasser Sep 11 '11 at 20:57
  • Okay, finally got to try your version of the `UPDATE` - it worked, updating the remaining 166 rows! But I still don't understand why this was. – Paul Bellora Sep 11 '11 at 22:05
  • Yeah I did try the original query with correlation names - it had the same effect. I'll update my question to include that. – Paul Bellora Sep 11 '11 at 22:29