This has to be a solved problem but I don't know the right terms to search for on google. So, will explain the problem here.
I have the following dataset that has two different identifiers for users (say id1
and id2
).
+------+-----+-------+
| id1 | id2 | value |
+------+-----+-------+
| 1 | 11 | blah1 |
| 1 | 12 | blah2 |
| 2 | 13 | blah3 |
| null | 14 | blah4 |
+------+-----+-------+
There is a one-to-many relationship between id1
and id2
and so users with id2
11 and 12 are actually the same users. I want to replicate the rows for such users so that the value
is associated with each id2
. The resulting dataset would then look like
+------+-----+-------+
| id1 | id2 | value |
+------+-----+-------+
| 1 | 11 | blah1 |
| 1 | 12 | blah2 |
| 2 | 13 | blah3 |
| null | 14 | blah4 |
| 1 | 12 | blah1 |
| 1 | 11 | blah2 |
+------+-----+-------+
As you can see, the value blah1
is now associated with both 11 and 12 id2
, as is the value blah2
.
There must be some kind of self-join
that does that but I am not aware of what it is called (SQL newbie). Would appreciate if some one could point me in the right direction.