I have a dataframe with columns g1 and g2, representing two groups, and a val column holding the count of items from g1 that can also be found in g2.
## Input dataframe
data.frame(
g1 = c('a','a','a','b','b','b','c','c','c','d'),
g2 = c('a','b','c','a','b','c','a','b','c','d'),
val = c(10,4,1,4,5,0,1,0,3,4),
stringsAsFactors = FALSE
)
I'm having trouble reshaping the dataframe into the form below. I can create an empty, named matrix whose row and column names are the distinct values of g1 and g2, then iterate over each row of the input dataframe and write its value to the cell at row g1, column g2, but that seems inefficient. Is there a library function that automates this?
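For reference, the manual approach I describe looks roughly like this. It is only a minimal sketch: the names `df`, `groups`, `m`, and `overlap` are ones I picked for illustration, and I'm assuming any g1/g2 pair missing from the input should be 0.

```r
# Input data, copied from the example above
df <- data.frame(
  g1 = c('a','a','a','b','b','b','c','c','c','d'),
  g2 = c('a','b','c','a','b','c','a','b','c','d'),
  val = c(10,4,1,4,5,0,1,0,3,4),
  stringsAsFactors = FALSE
)

# Distinct group labels taken from both columns
groups <- sort(unique(c(df$g1, df$g2)))

# Empty (zero-filled) matrix named by group on both axes
m <- matrix(0, nrow = length(groups), ncol = length(groups),
            dimnames = list(groups, groups))

# Write each val into the cell at row g1, column g2
for (i in seq_len(nrow(df))) {
  m[df$g1[i], df$g2[i]] <- df$val[i]
}

# Convert to a dataframe with group labels as row names
overlap <- as.data.frame(m)
```

This works on the example, but the row-by-row loop is the part I'd like to replace.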
## Output overlap matrix
data.frame(a = c(10,4,1,0),
b = c(4,5,0,0),
c = c(1,0,3,0),
d = c(0,0,0,4),
row.names = c('a','b','c','d'))
A similar question about overlap between groups has been asked before, but there the input is a list of groups and the items in each, and the goal is to find how many items overlap between groups.
Here, I know how many items overlap between groups, but am having trouble formatting it the correct way.