SQL : hang two tables on keys of another table, w/o all combinations

Question

Following Problem:

I got two tables (PK=PrimaryKey):

Table B(yes, it is B, not A. EDIT: but A does exist):

ColumnHeader_PK_A    ColumnHeader_PK_B
PK_A_1               PK_B_1
PK_A_1               PK_B_2
PK_A_1               PK_B_3
PK_A_2               PK_B_1
PK_A_2               PK_B_2

table C:

ColumnHeader_PK_A    ColumnHeader_PK_C
PK_A_1               PK_C_1
PK_A_1               PK_C_2
PK_A_1               PK_C_3
PK_A_1               PK_C_4
PK_A_1               PK_C_5
PK_A_2               PK_C_1

I need a result like following:

ColumnHeader_A    ColumnHeader_B    ColumnHeader_C
PK_A_1            PK_B_1            PK_C_1
PK_A_1            PK_B_2            PK_C_2
PK_A_1            PK_B_3            PK_C_3
PK_A_1            Null              PK_C_4
PK_A_1            Null              PK_C_5
PK_A_2            PK_B_1            PK_C_1
PK_A_2            PK_B_2            Null

I think my Problem is pretty simple, but I could not find a solution in nearly 12h of thinking. I work with MS Access, VBA would not be a Problem but I need SQL for Performance. there are a million rows. I just want to hang two lists on one key of another list

So what is the reason `PK_B1_1`,`PK_C_1` is a valid combination, while `PK_B1_1`,`PK_C_2` is not? There is some arbitrary business rule there which you haven't explained. — GolezTrol, Jun 07 '18 at 09:38
Anyway, this is not simple. You have to define some *"index per value"*, so for example `PK_B_2` is the 2nd value in `B` for the key `PK_A_1`, so it's *"index per value"* is `2`. You can then make a `full outer join` between B and C on the primary key *and* that artificial *"index per value"*. Or, if there is indeed a table A, you can make two `left outer joins` on A, instead of a `full outer join`, since [Access doesn't support full outer joins](https://stackoverflow.com/questions/19615177/how-do-i-write-a-full-outer-join-query-in-access), and you'd have to work around that too. — GolezTrol, Jun 07 '18 at 09:46
the combination does not exist. table B and C a seperate tables only sharing the PK_A_x. It is a customer defined database. with the needed result I want to get a better Overview. Edit: Thanks so far, I am going to work with this now — Fabrice, Jun 07 '18 at 09:47
I know it doesn't exist, but you have to define the logic there. It can be as simple as saying, *"the lowest PK_B for PK_A_1 will be on the same row as the lowest PK_C for PK_A_1, etc, and if there is no value in either of the two, I want to see `null`"*, but even that is not trivial, especially in Access. For SQL Server you could use window functions to calculate whether it's the first, second or third value, but I don't think that exists in Access, in which case you'd have to do complicated solutions, possibly with helper tables. — GolezTrol, Jun 07 '18 at 09:50
I'm not bad with SQL at all, but my experience with Access specifically is too limited to solve this problem for you. But It would help if you at least confirm that the business logic above is indeed correct (or give the correct version), and whether there indeed is a table A that has those `PK_A's` (Must be, otherwise they wouldn't be primary keys). That information will certainly help others who may then be able to help you with the actual query. — GolezTrol, Jun 07 '18 at 09:56
the logic is correct. I can imagine your solution. I do not think working around outer joins will be a Problem, I am currently learning to work with "index per value" in VBA. never used "index per value" so far — Fabrice, Jun 07 '18 at 10:06

Gordon Linoff · Accepted Answer · 2018-06-08T02:03:52.833

0

This is ridiculously complicated in MS Access, although not so hard in other databases.

What you need to do is to enumerate the values and then join. You can do the enumeration using subqueries.

select pk_a,
       max(pk_b) as pk_b,
       max(pk_c) as pk_c
from ((select b.pk_a, b.pk_b, NULL as pk_c
              (select count(*)
               from b as b2
               where b2.pk_a = b.pk_a and b2.pk_b <= b.pk_b
             ) as seqnum
       from b
      )
      union all
      (select c.pk_a, NULL as pk_b, NULL as pk_c
              (select count(*)
               from c as c2
               where c2.pk_a = c.pk_a and c2.pk_x <= c.pk_c
             ) as seqnum
       from c
      )
     ) bc
group by pk_a, seqnum;

Note that the columns that are called "pk" show a lack of understanding of relational databases. Primary keys do not have duplicates in their primary table.

edited Jun 08 '18 at 02:03

answered Jun 07 '18 at 11:04

Gordon Linoff

1,242,037
58
646
786

for the max() function the pk have to be numerical, if so, i get this: Edit: my handy makes it dificult to write properly...i will reedit at home in a while – Fabrice Jun 07 '18 at 12:20
@Fabrice . . . `max()` works on basically all column types, so I don't know what you are trying to convey in your comment. – Gordon Linoff Jun 07 '18 at 12:33
just recognized it yes. but still, if column pk_b is filled the column pk_c is empty and otherwise. i can show this evening. the outest select only filters out lower values – Fabrice Jun 07 '18 at 12:39
maybe a 'where' clause would be better in the outter 'select' statement instead of 'max()' – Fabrice Jun 07 '18 at 12:45
@Fabrice . . . The query is written as I intend (although there may be syntax errors). – Gordon Linoff Jun 07 '18 at 12:49
I just edited my question with an attempt of your solution – Fabrice Jun 07 '18 at 22:45
@Fabrice . . . There was a typo. The correlation clause was not referring to the outer query. – Gordon Linoff Jun 08 '18 at 02:04

SQL : hang two tables on keys of another table, w/o all combinations

1 Answers1