1

I've got two tables that have one to many associations on a pmid. So if one table has an pmid, the second table should have multiple rows with the same pmid. However, something went sideways and I'm missing my latest batch of pmids in the second table. These queries, should help illustrate the problem, but I can't figure out how to get the ids from the first table that are actually missing in the second table.

select count(*) from abstract_mesh am; #2167101

select count(*) from abstract_mesh am
join abstracts a on am.pmid = a.pmid; #2133848

select 2167101 - 2133848;  #33253

select count(*) from abstract_mesh where pmid is NULL; #33253

So as you can see there are 33,253 rows in abstract_mesh that have no pmids. I simply want to identify which pmids I should be interested in from the abstracts table.

4m1r
  • 12,234
  • 9
  • 46
  • 58
  • https://explainextended.com/2009/09/15/not-in-vs-not-exists-vs-left-join-is-null-sql-server/ if you want some reading – Blag Nov 29 '16 at 01:03
  • Does this answer your question? [Select from one table where not in another](https://stackoverflow.com/questions/7596073/select-from-one-table-where-not-in-another) – outis Mar 27 '22 at 09:44

2 Answers2

1

You can use NOT EXITS to filter out the records, e.g.

select * 
from table1 t1 
where not exists
select * from table2 t2 where t1.pmid = t2.pmid;
Darshan Mehta
  • 30,102
  • 11
  • 68
  • 102
  • Yours isn't valid standard SQL; the table expression involving `table2` would need to be enclosed in parens. Does this really work on mySQL? P.S. you haven't used the OP's table names, the SO mods take a dim view on boilerplate answers. – onedaywhen Nov 29 '16 at 09:13
  • Well, I wanted to suggest an option to OP re how to do it rather than service the answer on the plate. Also, as far as `NOT IN` is concerned, it's considered as a bad practice, this SO answer explains why (http://stackoverflow.com/questions/173041/not-in-vs-not-exists). – Darshan Mehta Nov 29 '16 at 09:35
  • If I had -2 points on SO for SQL I've posted that someone else declared to be 'bad practice' then I'd have... well, the number of points I have :) The page you link to is about optimzation, not semantics. Why isn't premature optimization pejorative in SQL Land like it is for other languages? Why can't SQL vendors build optimizers that recognize semantically-equivalent queries? – onedaywhen Nov 29 '16 at 10:14
0

You need and anti-join. SQL lacks an explicit anti-join operator. Standard SQL has EXCEPT (relational minus) by mySQL lacks this. Here I'm using NOT IN <table expression> to simulate anti-join (though not 100% sure I have the tables the right way round):

  SELECT DISTINCT pmid
    FROM abstract_mesh 
   WHERE pmid NOT IN ( SELECT pmid FROM abstracts );
onedaywhen
  • 55,269
  • 12
  • 100
  • 138