0

This has to be a solved problem but I don't know the right terms to search for on google. So, will explain the problem here.

I have the following dataset that has two different identifiers for users (say id1 and id2).

+------+-----+-------+
| id1  | id2 | value |
+------+-----+-------+
| 1    | 11  | blah1 |
| 1    | 12  | blah2 |
| 2    | 13  | blah3 |
| null | 14  | blah4 |
+------+-----+-------+ 

There is a one-to-many relationship between id1 and id2 and so users with id2 11 and 12 are actually the same users. I want to replicate the rows for such users so that the value is associated with each id2. The resulting dataset would then look like

+------+-----+-------+
| id1  | id2 | value |
+------+-----+-------+
| 1    | 11  | blah1 |
| 1    | 12  | blah2 |
| 2    | 13  | blah3 |
| null | 14  | blah4 |
| 1    | 12  | blah1 |
| 1    | 11  | blah2 |
+------+-----+-------+ 

As you can see, the value blah1 is now associated with both 11 and 12 id2, as is the value blah2.

There must be some kind of self-join that does that but I am not aware of what it is called (SQL newbie). Would appreciate if some one could point me in the right direction.

Nik
  • 5,515
  • 14
  • 49
  • 75
  • 1
    Possible duplicate of [How to implement one-to-one, one-to-many and many-to-many relationships while designing tables?](https://stackoverflow.com/questions/7296846/how-to-implement-one-to-one-one-to-many-and-many-to-many-relationships-while-de) – E. Earl Sep 25 '18 at 16:56
  • @E.Earl : my data is not in a normalized form. I can break that into two separate datasets but even after splitting it into `Dataset1( id2, value)` and `Dataset2( id1, id2)`, I still don't know how to specify a join resulting in the above dataset. – Nik Sep 25 '18 at 17:08
  • On second thoughts, that might work. If I split the dataset into `Dataset1 (id1, value)` and `Dataset2(id1, id2)` and then do a `Dataset1 LEFTJOIN Dataset2 ON id1`. Let me try that. – Nik Sep 25 '18 at 17:12
  • "I don't know the right terms to search for on google" Please always google many clear, concise & specific versions/phrasings of your question/problem/goal with & without your particular strings/names & read many answers. Add relevant keywords you discover to your searches. If you don't find an answer then post, using 1 variant search as title & keywords for your tags. See the downvote arrow mouseover text. When you do have a non-duplicate code question to post please read & act on [mcve]. Yes, writing clearly is difficult. Make the effort. You don't need to know what things are called. – philipxy Sep 26 '18 at 03:02
  • Please clarify via post edits, not comments. PS We do not need constraints or NFs to query. The meanings of queries are built from conditions & the meanings of base tables per the relational operators we use. [Is there any rule of thumb to construct SQL query from a human-readable description?](https://stackoverflow.com/a/33952141/3404097) In "There is a one-to-many relationship between id1 and id2 and so users with id2 11 and 12 are actually the same users" the "and so" is backwards. Rows of both tables talk about users identified by ids & the same users participate in both, so ids are 1:1. – philipxy Sep 26 '18 at 03:16

2 Answers2

1

Well, you can self join, its totally permitted... Join will link columns based on a key connection (in the general case) Notice that in this case will also need union, because you'd like more lines, not columns

SELECT t.*
FROM
table t
INNER JOIN table t2 ON t.id1 = t2.id1 AND t. id2 != t2.id2

UNION

SELECT t.*
FROM
table t
INNER JOIN table t2 ON t.id1 = t2.id1 AND t. id2 = t2.id2
Guy Louzon
  • 1,175
  • 9
  • 19
0

You can generate the rows using join for this purpose:

select i.id1, i.id2, iv.value
from (select distinct id1, value from t) iv join
     (select distinct id1, id2 from t) i 
     on iv.id1 = i.id1 ;

Actually, the second select distinct is probably not necessary (unless your original data has duplicates which is would if you added these rows back into the table), but I think it make the query clearer. This should also work:

select t.id1, t.id2, iv.value
from (select distinct id1, value from t) iv join
     t 
     on iv.id1 = t.id1 ;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786