2

My goal is to get a query written. I have three tables, A, B and C. The tables are written such that A.bID = B.bID, and B.cID = C.cID. This basically allows me to write a query where I link a record from a to b, and link the b record to a record from c. So far so good, simple query.

What my problem is... one of the columns included in the query (let's call it C.col3) has to have unique values; the values in this column can only show up once in the query result, but other columns from the other tables do not have this requirement.

Can anybody help me write this query?

Thanks...

Update 1:

Here is the table layout (sorry, I have to use generic names)

Table A
aID, bID, aCol1, aCol2, aCol3 ... aCol10

Table B
bID, cID, bCol1, bCol2, bCol3 ... bCol10

Table C
cID, cCol1, cCol2, col3, cCol4 ... cCol10

Without the unique value constraint in col3, I would write the query like this:

SELECT
    A.aID, A.bID, A.aCol1 ... A.aCol10,
    B.bID, B.cID, B.bCol1 ... B.bCol10,
    C.cID, C.cCol1, C.cCol2, C.col3 ... C.cCol10
FROM
    A, B, C
WHERE 
    A.bID = B.bID AND B.cID = C.cID

... but of course that doesn't make sure the that values in C.col3 are unique.

Update 2: More info...
Table A and Table B have a one to many relationship; A is the "header", B is the "item".
Table B and Table C have a one to one relationship.

These tables are part of a caching mechanism, so lots of data that looks similar, but is still different in some cols.

Since A is the header, most of the duplicate values will be found in A.

I first need to order the rows by A.aID, but then after that I only need the first rows returned, where the value for C.col3 does not appear in a previous row for that col.

Does that make things a little clearer, or am I still not making any sense? :)

Final Update:

I chose Bartosz Klimek's answer as it was the closest to what I needed; I just had to modify the nested join clause in the middle.

Thank you all for your help!

user85116
  • 4,422
  • 7
  • 35
  • 33
  • I should add that I'm aware I can probably do something like a nested select statement ... (where c.col3 in (select unique(c.col3) from...)) etc., but wondering if there is a better / proper way to do this. – user85116 Mar 31 '09 at 14:17
  • As others have posted, DISTINCT is probably what you're after. However, from the sounds of things, your problem may be more complicated than that. Can you post some more information about your tables, their contents and the actual results you'd like to see? – Robin Day Mar 31 '09 at 14:20
  • I agree with Robin, we're not going to be able to help you without more information. Just doing a DISTINCT is probably not going to be the answer you're looking for. – Nathan Koop Mar 31 '09 at 14:24
  • Yes, should have made this more clear... the actual query contains over 20 columns, plus the column that needs distinct values, so using a group by with aggregate functions on the other cols wont' work. I'm using JDBC. – user85116 Mar 31 '09 at 15:09
  • I'm not very familiar with the java side of development, but isn't JDBC a database driver standard? It's not a DBMS. Could you possibly provide some data to help us assist you. As JohnFx points out, the question, as currently provided, is flawed. – Nathan Koop Mar 31 '09 at 15:22
  • Yes, JDBC is Java's layer on top of specific database drivers. But this is just a general sql problem, and (shouldn't ??) have anything to do with which database I'm using; it's just a general query. I added more details to my question. – user85116 Mar 31 '09 at 15:24
  • I have added an answer below to hopefully help explain why the question doesn't work – Nathan Koop Mar 31 '09 at 15:34
  • Can you add some example with a few rows of sample data for the source tables and an example of the output you want? – JohnFx Mar 31 '09 at 15:51

5 Answers5

3

I'm going to quickly make a little example of what you're trying to do and hopefully this will help clarify why what you are asking (currently) is impossible.

If you had a Customer Table [CustomerID, CustomerName] and an Orders Table [OrderID, CustomerID, DollarAmount]

If you wanted all orders for customers:

SELECT CustomerName, OrderID, DollarAmount
FROM Customer, Orders
WHERE Customer.CustomerID = Orders.CustomerID

it would return

 "Acme Corp.", 1, $2300
 "Acme Corp.", 2, $3022
 "A company",  3, $1234

Everything is good.

But the equivalent of your question is asking for this query, but with unique CustomerNames. What would you display for OrderID and DollarAmount beside "Acme Corp"?

You could use aggregates to display something,

SELECT CustomerName, MAX(OrderID), SUM(DollarAmount)
FROM Customer, Orders
WHERE Customer.CustomerID = Orders.CustomerID
GROUP BY Orders.CustomerID

But I believe that you mentioned that you do not want to use aggregates.

Does this explain the issue clearly?

Nathan Koop
  • 24,803
  • 25
  • 90
  • 125
  • I think what Bartosz Klimek is writing is close to the mark; you are assuming that I want the other rows returned (and having trouble visualing what goes in the column "col3"); I don't want that row anymore then, at all, if the value in col3 for that row can already be found in another row. – user85116 Mar 31 '09 at 15:39
  • BTW, I do appreciate your help; I'm having trouble explaining my problem in a way that is "understandable", but yet doesn't reveal info. that I'm not at liberty to reveal :) – user85116 Mar 31 '09 at 15:40
  • Me too on the "is impossible".. i mean +1. – Mark Nold Mar 31 '09 at 15:41
  • @unknown -- so look at Nathan's example and tell us, what output do you actually want in that situation, and what's the logic that determines it? – Dave Costa Mar 31 '09 at 15:49
  • Please see my second update; I don't want that row at all, if the customer's name can be found in a previous row. – user85116 Mar 31 '09 at 16:04
2

I started to post another answer, but after rethinking it I deleted it. If I am reading the question correctly, I think this is an impossible/illogical question. Let me explain with an example. if I read this wrong, please clarify the question with an exampe of what you are looking for.

Table A

BID COL1
1   Value1
2   Value1
3   Value2

Table B

BID CID COL 2
1   4   ValueX
2   5   ValueY
3   6   ValueZ

Table C

CID COL3
4   Value#
5   Value@
6   Value~

Expected Result

A.Col1   A.BID B.BID B.CID B.COL2 C.CID C.COL3
Value1?? 1     1     4     ValueX 4     Value#
Value1?? 2     1     5     ValueY 5     Value@ 
Value2   3     3     6     ValyeZ 6     Value~

Per the question you don't want value1 repeated in the first column, but what do you propose goes into the second row where it would normally be repeated if you didn't have the unique constraint?

Community
  • 1
  • 1
JohnFx
  • 34,542
  • 18
  • 104
  • 162
  • +1, I agree completely, the question is flawed/impossible to answer. – Nathan Koop Mar 31 '09 at 15:07
  • BTW: Sorry for the formatting in my example. It looks right in the preview, but it totally mangled my tables after I saved them. – JohnFx Mar 31 '09 at 15:25
  • Finally got the formatting right. I guess the preview shows embedded HTML tables and the regular view strips them out. Who knew? – JohnFx Mar 31 '09 at 15:55
1
SELECT A.*, B.*, C.*
  FROM C
    JOIN B ON B.cID = C.cID
    JOIN A ON A.bID = B.bID
    JOIN
    (
      SELECT id = min(aID)
        FROM C
          JOIN B ON B.cID = C.cID
          JOIN A ON A.bID = B.bID
        GROUP BY col3
    ) D ON D.id = A.aID

Note that the subquery at the end assures that for each col3 value you will have at most one record in the final resultset. The record selected is the one with the minimal aID. Obviously I assume that aID, bID and cID are primary keys of A, B and C, respectively.

Bartosz Klimek
  • 555
  • 2
  • 5
0
select distinct c.col3 from c inner join b on c.cID = b.cID inner join a on b.bID = a.bID
Eppz
  • 3,178
  • 2
  • 19
  • 26
  • This won't work for him, he's looking to retrieve multiple columns and have only distinct on c.col3, not on all columns. – Nathan Koop Mar 31 '09 at 14:28
0

If you need other values from the tables as well, use:

select max(a.col1), sum(b.col2), col3 from a, b, c
where A.bID = B.bID, and B.cID = C.cID
group by C.col3

On all columns not grouped by you need to use aggregate functions such as

  • AVG: Average of the column.
  • COUNT: Number of records.
  • MAX: Maximum of the column.
  • MIN: Minimum of the column.
  • SUM: Sum of the column.
Ralph M. Rickenbach
  • 12,893
  • 5
  • 29
  • 49