I would like some help with the following problem, and to understand how it can be solved efficiently in R. My data frame has the header and rows shown below.

Component, TLA
C1, TLA1
C2, TLA1
C1, TLA2
C3, TLA2
C4, TLA3
C5, TLA3

Notice that C1 is a component of TLA1 and TLA2.

I would like to split the rows into mutually exclusive subsets and create a new column called Group that identifies each subset. For the data above, the subsets and the new Group column values would be:

Component, TLA, Group
C1, TLA1, 1
C2, TLA1, 1
C1, TLA2, 1
C3, TLA2, 1
C4, TLA3, 2
C5, TLA3, 2

Appreciate any help on this. I could have looped through the observations and tried some logic but I did not try that yet.

  • What is the logic for determining the group? – adaien Mar 28 '16 at 01:10
  • You should try something before you post a question. SO is to help you when you get stuck, not do your work for you. – Rich Scriven Mar 28 '16 at 01:10
  • You can also check out [this](http://stackoverflow.com/questions/18799901/data-frame-group-by-column) link – steveb Mar 28 '16 at 01:30
  • Hello Hadd; Yes, I tried, but my brain went to Perl and how I could build an anonymous hash there to perform this task. The connection with igraph (suggested by Neal Fultz and others) is something I could not have thought of. – Satish Vadlamani Mar 29 '16 at 04:48

1 Answer

What you want are the connected components of a bipartite graph in which each row of the data frame is an edge between a Component and a TLA. Using the igraph package:

> library(igraph)
> components(graph_from_edgelist(as.matrix(df)))
$membership
   C1  TLA1    C2  TLA2    C3    C4  TLA3    C5 
    1     1     1     1     1     2     2     2 

$csize
[1] 5 3

$no
[1] 2

which we can plug back into the original data frame:

> i <- components(graph_from_edgelist(as.matrix(df)))$membership
> df$group <- i[as.character(df$Component)]
> df
  Component   TLA group
1        C1  TLA1     1
2        C2  TLA1     1
3        C1  TLA2     1
4        C3  TLA2     1
5        C4  TLA3     2
6        C5  TLA3     2
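
If igraph is not available, the same grouping can be approximated in base R with a small union-find over the values in the two columns. This is only a minimal sketch under stated assumptions: the helper name `group_by_shared_links` is hypothetical (it is not part of igraph or base R), the data frame is rebuilt here from the question's example, and groups are numbered in order of first appearance.

```r
# Example data from the question (plain character columns, no factors).
df <- data.frame(
  Component = c("C1", "C2", "C1", "C3", "C4", "C5"),
  TLA       = c("TLA1", "TLA1", "TLA2", "TLA2", "TLA3", "TLA3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns one group id per row, where rows that are
# linked through shared values in `a` or `b` receive the same id.
group_by_shared_links <- function(a, b) {
  # Prefix each column so a value occurring in both is not merged into one node.
  a <- paste0("a:", a)
  b <- paste0("b:", b)
  nodes <- unique(c(a, b))
  ia <- match(a, nodes)          # node index of each row's first endpoint
  ib <- match(b, nodes)          # node index of each row's second endpoint
  parent <- seq_along(nodes)     # every node starts as its own root

  # Follow parent pointers until a root (a node that is its own parent).
  find_root <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  # Each row is an edge: merge the sets containing its two endpoints.
  # Always attaching the larger root to the smaller one avoids cycles.
  for (k in seq_along(ia)) {
    ra <- find_root(ia[k])
    rb <- find_root(ib[k])
    if (ra != rb) parent[max(ra, rb)] <- min(ra, rb)
  }

  # Label rows by the root of their first endpoint, renumbered 1, 2, ...
  roots <- vapply(ia, find_root, integer(1))
  match(roots, unique(roots))
}

df$group <- group_by_shared_links(df$Component, df$TLA)
print(df)
```

For large data frames the igraph version above is likely the better choice, since this sketch does no path compression and loops over rows in plain R.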
Neal Fultz