2

I have some SQL to tell me the superheroes who don't ever wield a sword.

Creating the table

First, here is the (MySQL) script to create and populate the table.

CREATE TABLE IF NOT EXISTS `mydb`.`weaponry` (
  `hero` VARCHAR(20) NULL,
  `weapon` VARCHAR(20) NULL)
ENGINE = InnoDB

insert into weaponry (hero, weapon) values ('Wonder Woman', 'Lasso of truth');
insert into weaponry (hero, weapon) values ('Wonder Woman', 'Sword');
insert into weaponry (hero, weapon) values ('Wonder Woman', 'Shield');
insert into weaponry (hero, weapon) values ('Spider Man', 'Web');
insert into weaponry (hero, weapon) values ('Gamora', 'Sword');
insert into weaponry (hero, weapon) values ('Gamora', 'Daggar');
insert into weaponry (hero, weapon) values ('Batman', 'Batarang');
insert into weaponry (hero, weapon) values ('Batman', 'Batgrapple');
insert into weaponry (hero, weapon) values ('Batman', 'Tranquilizer gun');

Working code using a NOT IN and a subselect

select distinct hero from weaponry
  where hero not in
  (select hero from weaponry where weapon = 'Sword')

This correctly returns Spider Man and Batman, who don't wield a sword.

Alternative 1: Using NOT EXISTS

I would like to implement the same query using NOT EXISTS. I looked at https://stackoverflow.com/a/5231731/509840. I tried this:

select distinct hero from weaponry
  where not exists
  (select * from weaponry where weapon = 'Sword')

Unfortunately, this returns no rows.

Alternative 2: Using LEFT JOIN

I also took a look at this answer: https://stackoverflow.com/a/17413996/509840 . I was curious to try replacing the NOT IN with a LEFT JOIN.

  select * from weaponry a
    left join weaponry b on a.hero=b.hero and a.weapon=b.weapon
    where a.weapon <> 'Sword'

But this just returns all non-sword heroes (including Wonder Woman and Gamora), since all wield weapons other than swords.

Would you please help me rewrite the NOT IN query with either the NOT EXISTS or the LEFT JOIN?

EDIT: Left Join and the correlated subquery

@JeffUK suggested in his comments that one could understand the correlated subquery better by removing the WHERE clause. Here's a small modification to the select with the where removed.

select distinct a.hero, weaponryInner.hero from weaponry a
    left join 
        (Select * from weaponry b where weapon = 'Sword') as weaponryInner
    on a.hero=weaponryInner.hero 

The results are two columns of heroes, some of which are NULL.

-------------------------------
Hero          |  Hero
-------------------------------
Wonder Woman  |  Wonder Woman
Gamora        |  Gamora
Spider Man    |  <null> 
Batman        |  <null>
-------------------------------

So adding the final where weaponryInner.Weapon is null will return just Spider Man and Batman.

rajah9
  • 11,645
  • 5
  • 44
  • 57
  • 2
    Can I just say this is the best formatted question I've seen on here for a long time. You sir, are a superhero who never wields a sword! – JeffUK Jun 08 '17 at 19:31
  • 1
    Agreed sample data, what has been tried, expected results... what more could a person trying to help ask for. Keep up the good work @rajah9 – xQbert Jun 08 '17 at 19:35

3 Answers3

4

First one you are checking for the existence of any sword wielding heros, you need to check only for the 'current' hero wielding a sword.

select distinct hero from weaponry as a
  where not exists
  (select * from weaponry as b where b.weapon = 'Sword' and b.hero = a.hero )

The second won't work for you, a join works on each row independently, so will do exactly what you're seeing. You could do something like this, a bit more complex but does the job.

We find only the heros with swords in the subquery, and join that to all heros, any heros with a 'NULL' weren't found in the subquery, therefore never wield swords.

select distinct a.hero from weaponry a
    left join 
        (Select * from weaponry b where weapon = 'Sword') as weaponryInner
    on a.hero=weaponryInner.hero 
where weaponryInner.Weapon is null
JeffUK
  • 4,107
  • 2
  • 20
  • 34
  • It would be nice to explain the correlated subquery concept and how it works. and for the left; ahh the good ole absense of data is data.. though not sure why you need the full outer. – xQbert Jun 08 '17 at 19:34
  • Expanded on it a little. The best way to understand what's happening would be to remove the final 'Where' clause, and change it to Select * on the first line. Then run it and review the results – JeffUK Jun 08 '17 at 19:35
  • thank you, both of your alternatives work. To @xQbert's question, I'll make an edit to my post. (Too messy to do inline.) – rajah9 Jun 08 '17 at 22:31
  • Query cost for the NOT EXISTS: 2.80. Query cost for the LEFT JOIN: 7.20. – rajah9 Jun 09 '17 at 13:42
  • Ultimately the left join is, in a round about way, doing a 'not exists' so no surprise it's less efficient than a built-in function to do the same job. – JeffUK Jun 09 '17 at 14:00
2

I think you can use the below:

select distinct hero from weaponry w
LEFT JOIN
(select hero from weaponry where weapon = 'Sword') h ON h.hero = w.hero
WHERE h.hero IS NULL
HGF
  • 389
  • 3
  • 15
  • I made a slight change to the query ("distinct hero" --> "distinct w.hero") and your left join correctly replicated the original NOT IN results. Well done. – rajah9 Jun 08 '17 at 21:08
  • Query cost from SQL explain: 7.20. – rajah9 Jun 09 '17 at 13:39
1
SELECT h.hero
  FROM weaponry h
    LEFT JOIN weaponry w
      ON    w.hero = h.hero
        AND w.weapon = 'Sword'
  WHERE w.weapon is null
  GROUP BY h.hero

OR

SELECT distinct h.hero
  FROM weaponry h
  WHERE not exists( SELECT 1 
                      FROM weaponry w
                      WHERE w.hero = h.hero
                        AND w.weapon = 'Sword'
                  )
Sal
  • 1,307
  • 1
  • 8
  • 16
  • Thank you, both answers work. I have tweaked the LEFT JOIN answer a bit and found that you can use a SELECT DISTINCT and drop the GROUP BY clause. – rajah9 Jun 09 '17 at 13:34
  • Query cost for the LEFT JOIN: 16.20 (as written). Query cost for the LEFT JOIN using a DISTINCT and dropping the group by: 7.20. Query cost for the NOT EXISTS: 2.80. – rajah9 Jun 09 '17 at 13:41
  • Do you have an index on weaponry.hero? or even better on hero and weapon?. – Sal Jun 09 '17 at 15:14
  • No, not using indexes on this small example DB. But interesting speedup when the GROUP BY is dropped and a DISTINCT is added. And NOT EXISTS rocks and will become my new favorite. – rajah9 Jun 09 '17 at 16:04