I have the following matrix in R, but with 22k entries:
> unbalanced
row col
[1,] 1 3
[2,] 4 5
[3,] 4 6
[4,] 5 6
[5,] 9 10
[6,] ...
Is there a way to divide this matrix into sets of intersecting data?
What I mean is:
If there is an intersection between any rows, I want a set with the resulting union of those rows.
Ideally, I would end up with a list of non-intersecting sets that together contain all the data from my original matrix.
Something like this (based on the above example):
[1,] 1 3
[2,] 4 5 6
[3,] 9 10
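To make the requirement concrete, here is a minimal base R sketch of the grouping I am after. It treats each row as a link between two values and joins linked values with a simple union-find. The function name `merge_rows` is made up for illustration, and the explicit loop is there for clarity only; it is a reference for the expected result, not a tuned solution for 22k rows.

```r
# Hypothetical helper (not from any package): group the values of a
# two-column matrix so that values linked by any chain of rows share a set.
merge_rows <- function(m) {
  vals <- sort(unique(as.vector(m)))  # every distinct value in the matrix
  parent <- seq_along(vals)           # each value starts as its own root

  # Follow parent links until reaching a root (a value that is its own parent).
  find <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  # Each row links two values: join their roots, keeping the smaller index.
  for (k in seq_len(nrow(m))) {
    ra <- find(match(m[k, 1], vals))
    rb <- find(match(m[k, 2], vals))
    if (ra != rb) parent[max(ra, rb)] <- min(ra, rb)
  }

  # Values that end up under the same root form one set.
  roots <- vapply(seq_along(vals), find, numeric(1))
  unname(split(vals, roots))
}

# The five rows shown above
unbalanced <- cbind(row = c(1, 4, 4, 5, 9), col = c(3, 5, 6, 6, 10))
sets <- merge_rows(unbalanced)
```

On the five rows shown above, `sets` should hold three vectors: 1 3, then 4 5 6, then 9 10.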
I have implemented something similar in Python in the past (using iteration), but R is quite a different "beast", and the way I perceive it, iterations such as for loops can and should be avoided.
Thank you in advance for any pointers you may provide.
Update:
Following @A. Webb's answer, aggregate(col~row,unbalanced,FUN=list) gets me close to what I want, but there is still a missing detail that might not have been evident from the original question.
The mentioned solution provides a list of sets that still share common data (what I called overlapping sets in the comments section).
To illustrate, further down in the list I get this:
row col
...
[160,] 160 c(161, 162, 194, 559, 1195)
[161,] 161 c(162, 194, 559, 1195)
...
What I need is the union of these two sets, since their intersection is not the empty set (∅). I should also add that I do not require the column named "row", so any solution that discards it is OK for me. I only need a list with the sets.
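As an illustration of the missing step, here is a rough base R sketch that takes a list of possibly overlapping sets and merges any that intersect. The name `merge_sets` is hypothetical, and I am assuming each set includes its own "row" value alongside the grouped "col" values. It is loop-based and only meant to show the result I expect.

```r
# Hypothetical helper: merge sets that share at least one element,
# returning a list of pairwise disjoint sets.
merge_sets <- function(sets) {
  merged <- list()
  for (s in sets) {
    # Which of the groups built so far overlap with the current set?
    hit <- vapply(merged, function(g) length(intersect(g, s)) > 0, logical(1))
    # Absorb every overlapping group into the current set.
    s <- sort(unique(c(s, unlist(merged[hit]))))
    # Keep the untouched groups and append the combined one.
    merged <- c(merged[!hit], list(s))
  }
  merged
}

# The two overlapping entries from the output above, plus one unrelated set
overlapping <- list(
  c(1, 3),
  c(160, 161, 162, 194, 559, 1195),
  c(161, 162, 194, 559, 1195)
)
result <- merge_sets(overlapping)
```

Here `result` should contain two sets: 1 3, and the single union 160 161 162 194 559 1195.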