Transform ids -> items to {pairs of ids} -> items

Question

I have a data.frame like this:

x1 <- data.frame(id=1:3,item=c("A","B","A","B","C","D"))
x1[order(x1$item),]
  id item
1  1    A
3  3    A
2  2    B
4  1    B
5  2    C
6  3    D

I want to get:

id1=c(1,2,1,3,2,3)
id2 = c(2,1,3,1,3,2)
A=c(0,0,1,1,0,0)
B=c(1,1,0,0,0,0)
C = 0
D=0
datawanted <- data.frame(id1,id2,A,B,C,D)
  id1 id2 A B C D
1   1   2 0 1 0 0
2   2   1 0 1 0 0
3   1   3 1 0 0 0
4   3   1 1 0 0 0
5   2   3 0 0 0 0
6   3   2 0 0 0 0

If person1 and person2 both have B, then column B in the datawanted data frame gets 1 for that pair; otherwise it gets 0.

Can someone suggest functions or an approach in R to deal with this problem?

id2 takes the same ids as id1. person1 and person2 have a contact on B, as in the example. – chunjin, Aug 05 '16 at 02:50
person2 and person3 do not have a contact, so the value is zero. – chunjin, Aug 05 '16 at 06:23

score 4 · Accepted Answer · edited May 23 '17 at 12:08

Cool question. You have a bipartite graph, so following Gabor's tutorial...

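library(igraph)
# Each (id, item) row becomes an edge; as.matrix() turns the ids into character vertex names
g = graph_from_edgelist(as.matrix(x1))
# Item vertices (capital letters) get type TRUE, id vertices get type FALSE
V(g)$type = grepl("[A-Z]", V(g)$name)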

For OP's desired output, first we can extract the incidence matrix:

gi = get.incidence(g)
#   A B C D
# 1 1 1 0 0
# 2 0 1 1 0
# 3 1 0 0 1

Note (thanks @thelatemail), that if you don't want to use igraph, you can get to gi as table(x1).
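
As a minimal sketch of the `table(x1)` route from the note above, using only base R and the question's toy data. It assumes each id/item pair occurs at most once, so that every count is 0 or 1:

```r
# Same toy data as in the question (id is recycled to length 6)
x1 <- data.frame(id = 1:3, item = c("A", "B", "A", "B", "C", "D"))

# Cross-tabulate ids against items: rows are ids, columns are items
gi <- table(x1)

# Row names hold the ids as character strings, column names hold the items
rownames(gi)
colnames(gi)

# Look up a single cell by id and item name
gi["1", "B"]
```

If a pair could appear more than once, the counts would exceed 1, and it may be worth capping them (for example with `pmin(gi, 1)`) before going further.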

Then, we look at the combinations of ids:

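# For each pair of rows of gi: the two ids, then the elementwise minimum of
# the two rows (1 only where both ids have the item)
res = t(combn(nrow(gi), 2, function(x) c(
    as.integer(rownames(gi)[x]),
    pmin( gi[x[1], ], gi[x[2], ] )
)))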

dimnames(res) <- list( NULL, c("id1", "id2", colnames(gi)))
#      id1 id2 A B C D
# [1,]   1   2 0 1 0 0
# [2,]   1   3 1 0 0 0
# [3,]   2   3 0 0 0 0

This essentially is the OP's desired output. They had included redundant rows (e.g., 1,2 and 2,1).
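
If the mirrored rows are wanted after all, one possible sketch is to swap the two id columns and stack the result under `res`. The names `swapped` and `both` are made up for this illustration, and `gi` is built with `table()` so that no package is needed:

```r
# Rebuild res as above, from the question's toy data
x1 <- data.frame(id = 1:3, item = c("A", "B", "A", "B", "C", "D"))
gi <- table(x1)
res <- t(combn(nrow(gi), 2, function(x) c(
    as.integer(rownames(gi)[x]),
    pmin(gi[x[1], ], gi[x[2], ])
)))
dimnames(res) <- list(NULL, c("id1", "id2", colnames(gi)))

# Copy res and exchange the two id columns; the item columns are symmetric
swapped <- res
swapped[, c("id1", "id2")] <- res[, c("id2", "id1")]

# Stack the original and mirrored pairs: 6 rows for 3 ids
both <- rbind(res, swapped)
```

The rows come out grouped (all original pairs first, then all mirrored ones) rather than interleaved as in the question's `datawanted`, so a reordering step would be needed to match it exactly.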

Fun reason to use a graph (ht Chris):

V(g)$color <- ifelse(V(g)$type, "red", "light blue")
V(g)$x     <- (1:2)[ V(g)$type + 1 ]
V(g)$y     <- ave(seq_along(V(g)), V(g)$type, FUN = seq_along)
plot(g)

Or, apparently this can be done more or less like

plot(g, layout = layout.bipartite(g)[,2:1])

answered Aug 05 '16 at 03:39 by Frank
edited May 23 '17 at 12:08 by Community

Isn't the first part of this just `table(x1)` ? – thelatemail Aug 05 '16 at 03:42
@thelatemail Sure, but it's a graph, so might as well store it as one. If the OP isn't done with their analysis after this, they might take advantage of whatever other tools igraph has (... though I'm not that familiar with them myself). Good point, though, I've edited to reflect it. – Frank Aug 05 '16 at 03:43
Thanks. If the ids change to something like c(1,3,4), the method you gave may cause a subscript out of bounds error. `combn` and `pmin` did help me. I need to think about how this can work when the ids do not correspond to row numbers. – chunjin Aug 05 '16 at 07:56
I use code like this; could it cause any problems? `res[,1] <- x1$id[res[,1]]` `res[,2] <- x1$id[res[,2]]` – chunjin Aug 05 '16 at 08:11
@chunjin Good catch. Yes, I think your way works. I've also edited to show a different way above, changing the construction of `res` to use `as.integer(rownames(gi)[x])` instead of `x`. – Frank Aug 06 '16 at 13:15
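
To illustrate the case raised in these comments, here is a small sketch with ids 1, 3 and 4. The data frame `x2` and the names `gi2` and `res2` are hypothetical, and `table()` stands in for the igraph incidence matrix:

```r
# Hypothetical variant of the question's data: ids 1, 3, 4 (there is no id 2)
x2 <- data.frame(id = c(1, 3, 4), item = c("A", "B", "A", "B", "C", "D"))
gi2 <- table(x2)

# x indexes rows of gi2 (1..3); the reported ids are read from its row names,
# so they need not coincide with row numbers
res2 <- t(combn(nrow(gi2), 2, function(x) c(
    as.integer(rownames(gi2)[x]),
    pmin(gi2[x[1], ], gi2[x[2], ])
)))
dimnames(res2) <- list(NULL, c("id1", "id2", colnames(gi2)))
res2
```

Because the ids are taken from the row names of the incidence matrix, this version does not depend on the order in which the ids happen to appear in the data frame.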
