Excluding multiple results in specific column (SQL JOIN)

Question

I'm taking my first steps in terms of practical SQL use in real life.

I have a few tables with contractual and financial information and the query works exactly as I need - to a certain point. It looks more or less like that:

SELECT /some columns/ from CONTRACTS

Linked 3 extra tables with INNER JOIN to add things like department names, product information etc. This all works but they all have simplish one-to-one relationship (one contract related to single department in Department table, one product information entry in the corresponding table etc).

Now this is my challenge:

I also need to add contract invoicing information doing something like:

inner join INVOICES on CONTRACTS.contnoC = INVOICES.contnoI

(and selecting also the Invoice number linked to the Contract number, although that's partly optional)

The problem I'm facing is that unlike with other tables where there's always one-to-one relationship when joining tables, INVOICES table can have multiple (or none at all) entries that correspond to a single contract no. The result is that I will get multiple query results for a single contract no (with different invoice numbers presented), needlessly crowding the query results.

Essentially I'm looking to add INVOICES table to a query to just identify if the contract no is present in the INVOICES table (contract has been invoiced or not). Invoice number itself could be presented (it is with INNER JOIN), however it's not critical as long it's somehow marked. Invoice number fields remains blank in the result with the INNER JOIN function, which is also necessary (i.e. to have the row presented even if the match is not found in INVOICES table).

SELECT DISTINCT would look to do what I need, but I seemed to face the problem that I need to levy DISTINCT criteria only for column representing contract numbers, NOT any other column (there can be same values presented, but all those should be presented).

Unfortunately I'm not totally aware of what database system I am using.

if you inner join invoices like every other table but do NOT put any columns from the invoices table into the column list and use DISTINCT you should still get your 1 to 1 results of the other tables. Other wise you can use IN or EXISTS instead in a where clause and reference your invoices table there. this question should give you plenty of ways to figure that out. http://stackoverflow.com/questions/173041/not-in-vs-not-exists — Matt, Dec 29 '16 at 22:05
It would be better if you show your whole queries and structures of your tables... — barudo, Dec 29 '16 at 22:08
[edit] your question and add some sample data and the expected output based on that data. [_Formatted_](http://stackoverflow.com/help/formatting) **text** please, [no screen shots](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question/285557#285557). Also tell us which DBMS you are using "something by IBM" isn't specific enough — , Dec 29 '16 at 23:22

score 1 · Answer 1 · edited May 02 '17 at 18:50

1

I would do it this way:

with mainquery as(
<<here goes you main query>>
),

invoices_rn as( 
select *, 
ROW_NUMBER() OVER (PARTITION BY contnoI order by 
<<some column to decide which invoice you want to take eg. date>>) as rn
)

invoices as (
select * from invoices_rn where rn = 1
)

select * from mainquery
left join invoices i on contnoC = i.contnoI

This gives you an ability to get all of the invoice details to your query, also it gives you full control of which invoice you want see in your main query. Please read more about CTEs; they are pretty handy and much easier to understand / read than nested selects.

I still don't know what database you are using. If ROW_NUMBER is not available, I will figure out something else :)

Also with a left join you should use COALESCE function for example:

COALESCE(i.invoice_number,'0')

Of course this gives you some more possibilities, you could for example in your main select do:

CASE WHEN i.invoicenumber is null then 'NOT INVOICED'
else 'INVOICED'
END as isInvoiced

edited May 02 '17 at 18:50

halfer

19,824
17
99
186

answered Dec 29 '16 at 22:16

cybernetic87

285
1
13

1

this is a good technique if OP wants to return a specific invoice in addition to other information. The RDBMs that support common table expressions and the ROW_NUMBER() window function, but you do not actually need your invoices CTE you can join directly to invoices_rn and then add rn = 1 in the ON clause of the join. – Matt Dec 29 '16 at 22:21
You're right. But I always do it in 2 CTE's. Just my habit. – cybernetic87 Dec 29 '16 at 22:23
I also think that for a begginer it looks clearer to do it step by step. I am professional datawarehouse programmer and I am still doing it with 2 CTE's, I find it a lot of easier to read and understand. I've became a CTE freak. Most of my vievs / procedures has tons of CTE's. – cybernetic87 Dec 29 '16 at 22:27
CTEs are wonderful and amazing but they can also confuse an execution plan sometimes. I still use them daily but I still try to constrain to less is better rule. – Matt Dec 29 '16 at 22:31
Matt, I am not allowed to comment your answer. It's good and detailed but I don't like it because you are using inner joins to the invoices table. This way you cut off all the contracts which hasn't got invoices and the OP doesn't want it. – cybernetic87 Dec 29 '16 at 22:42
WHERE EXISTS (SELECT 1 FROM Invoices i WHERE i.contnoI = c.contnoC ) – cybernetic87 Dec 29 '16 at 22:44
here also you are filtering out the not invoiced contracts. – cybernetic87 Dec 29 '16 at 22:45
thanks for the note I mean to change one to LEFT JOIN but forgot when copying in pasting. The others I do what to leave as inner join. When reading OP he mentions wanting only those contracts that have been invoiced which would mean an inner join. But then later suggests wanting specific information from the invoice table so I describe the difference of when to use it as a LEFT or an inner based on the requirement. cheers – Matt Dec 29 '16 at 22:48
OP Wrote something that doesn't make sense: "Invoice number fields remains blank in the result with the INNER JOIN function, which is also necessary (i.e. to have the row presented even if the match is not found in INVOICES table)." Invoice number field won't remain blank, there won't be a contract record at all if there is not invoice to it. – cybernetic87 Dec 29 '16 at 22:54

Matt · Answer 2 · 2016-12-29T22:46:11.233

Seems like the question is still getting some attention and in an effort to provide some explanation here are a few techniques.

If you just want any contract with details from the 1 to 1 tables you can do it similarily to what you have described. the key being NOT to include any column from Invoices table in the column list.

SELECT
   DISTINCT Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
    Contracts c
    INNER JOIN Departments D
    ON c.departmentId = d.Department
    INNER JOIN Product p
    ON c.ProductId = p.ProductId
    INNER JOIN Invoices i
    ON c.contnoC  = i.contnoI

Perhaps a Little cleaner would be to use IN or EXISTS like so:

SELECT
   Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
    Contracts c
    INNER JOIN Departments D
    ON c.departmentId = d.Department
    INNER JOIN Product p
    ON c.ProductId = p.ProductId
WHERE
    EXISTS (SELECT 1 FROM Invoices i WHERE i.contnoI = c.contnoC )

SELECT
   Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
    Contracts c
    INNER JOIN Departments D
    ON c.departmentId = d.Department
    INNER JOIN Product p
    ON c.ProductId = p.ProductId
WHERE
    contnoC IN (SELECT contnoI FROM Invoices)

Don't use IN if the SELECT ... list can return a NULL!!!

If you Actually want all of the contracts and just know if a contract has been invoiced you can use aggregation and a case expression:

SELECT
   Contract, Department, ProductId, CASE WHEN COUNT(i.contnoI) = 0 THEN 0 ELSE 1 END as Invoiced
FROM
    Contracts c
    INNER JOIN Departments D
    ON c.departmentId = d.Department
    INNER JOIN Product p
    ON c.ProductId = p.ProductId
    LEFT JOIN Invoices i
    ON c.contnoC  = i.contnoI
GROUP BY
    Contract, Department, ProductId

Then if you actually want to return details about a particular invoice you can use a technique similar to that of cybercentic87 if your RDBMS supports or you could use a calculated column with TOP or LIMIT depending on your system.

SELECT
   Contract, Department, ProductId, (SELECT TOP 1 InvoiceNo FROM invoices i WHERE c.contnoC  = i.contnoI ORDER BY CreateDate DESC) as LastestInvoiceNo
FROM
    Contracts c
    INNER JOIN Departments D
    ON c.departmentId = d.Department
    INNER JOIN Product p
    ON c.ProductId = p.ProductId
GROUP BY
    Contract, Department, ProductId

score -1 · Answer 3 · answered Dec 29 '16 at 21:59

-1

You can use SELECT ..., invoiced = 'YES' ... where exists ... union SELECT ..., invoiced = 'NO' ... where not exists ...

or you can use a column like "invoiced" with a subquery into invoices to set it's value depending on whether you get a hit or not

answered Dec 29 '16 at 21:59

Chris Caviness

586
3
10

Excluding multiple results in specific column (SQL JOIN)

3 Answers3