how to select a matrix column based on column name

Question

I have a table with shortest paths obtained with:

g<-barabasi.game(200)
geodesic.distr <- table(shortest.paths(g))
geodesic.distr
#   0    1    2    3    4    5    6    7 
# 117  298 3002 2478 3342 3624  800   28

I then build a matrix with 100 rows and the same number of columns as length(geodesic.distr):

geo<-matrix(0, nrow=100, ncol=length(unlist(labels(geodesic.distr))))
colnames(geo) <- unlist(labels(geodesic.distr))

Now I run 100 experiments where I create preferential attachment-based networks with

for(i in seq(1:100)){
    bar <- barabasi.game(vcount(g))
    geodesic.distr <- table(shortest.paths(bar))
    distance <- unlist(labels(geodesic.distr))
    for(ii in distance){
        geo[i,ii]<-WHAT HERE?
    }
}

and for each experiment, I'd like to store in the matrix how many paths I have found.

My question is: how do I select the right column based on its name? Some names produced by a simulated network may not be present in the original one, so I need to find not only the column with the matching name but also the closest one when there is no match. For example, if my maximum value is 7 and I end up with a path of length 9, which has no column in the geo matrix, I want to add it to the column named 7.

Please provide a reproducible example. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example — Roman Luštrik, Mar 27 '14 at 12:25
thank you, in order to replicate this code you just have to make a new graph g like: g<-barabasi.game(200) — user299791, Mar 27 '14 at 14:01
I have just deleted it, it's not relevant for the discussion... thanks for pointing it out. — user299791, Mar 28 '14 at 10:53

score 1 · Accepted Answer · answered Mar 28 '14 at 14:07

There is actually a problem with your approach. The length of the geodesic.distr table is stochastic, and you are allocating a matrix to store 100 realizations based on a single run. What if one of the 100 runs gives you a longer geodesic.distr vector? I assume you want to make the allocated matrix bigger in this case. Or, even better, you want to run the 100 realizations first and allocate the matrix once you know its size.
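To make the allocation problem concrete, here is a minimal base-R sketch. The matrix and its column names are made up for illustration and stand in for a geo matrix sized from a single run.

```r
# Matrix sized from a single run: columns exist only for distances 0, 1 and 2
geo <- matrix(0, nrow = 2, ncol = 3, dimnames = list(NULL, c("0", "1", "2")))

# A later run that produces distance 3 has no column to write into
result <- tryCatch({
  geo[2, "3"] <- 5
  "ok"
}, error = function(e) conditionMessage(e))

result
```

R reports a subscript error here instead of growing the matrix, which is why the column set has to be known before the matrix is filled.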

Another potential problem is that if you do table(shortest.paths(bar)), then you are (by default) considering undirected distances, so you will end up with a symmetric matrix and count all distances (except for self-distances) twice. This may or may not be what you want.
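The double counting is easy to see on a small example. The sketch below uses a hand-written symmetric matrix d (an assumption standing in for shortest.paths() output) and, as one possible remedy, counts each unordered pair once through upper.tri.

```r
# Hand-written symmetric distance matrix for a 3-vertex path a - b - c
d <- matrix(c(0, 1, 2,
              1, 0, 1,
              2, 1, 0), nrow = 3, byrow = TRUE)

# Tabulating the full matrix counts every pair twice and includes the zero diagonal
both_halves <- table(d)

# Tabulating only the upper triangle counts each unordered pair once
once <- table(d[upper.tri(d)])

both_halves
once
```

Whether the halved counts are the right ones depends on whether the distances are meant as ordered or unordered pairs.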

Anyway, here is a simple way, with the matrix allocated after the 100 runs:

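library(igraph)

# Run all 100 realizations first and keep each distance table
dists <- lapply(1:100, function(x) {
  bar <- barabasi.game(vcount(g))
  table(shortest.paths(bar))
})
# Size the matrix from the longest table, padding shorter ones with zeros
maxlen <- max(sapply(dists, length))
geo <- t(sapply(dists, function(d) c(d, rep(0, maxlen-length(d)))))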
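For the literal question of selecting columns by name, a minimal base-R sketch follows. It is not part of the approach above: fill_row and max_dist are hypothetical names introduced for illustration, the distance vectors are made up in place of igraph output, and the column names are assumed to be the contiguous integers from 0 up to the largest distance. Distances beyond the last column are folded into it, as the question describes.

```r
# Hypothetical helper: write one run's distance counts into row i of geo,
# selecting columns by name and folding longer distances into the last column
fill_row <- function(geo, i, distances) {
  max_dist <- max(as.integer(colnames(geo)))  # largest distance that has a column
  clamped  <- pmin(distances, max_dist)       # e.g. 9 becomes 3 when max_dist is 3
  counts   <- table(clamped)
  # A character index on the column dimension looks columns up by name
  geo[i, names(counts)] <- as.vector(counts)
  geo
}

# Made-up data standing in for shortest.paths() output
geo <- matrix(0, nrow = 2, ncol = 4, dimnames = list(NULL, as.character(0:3)))
geo <- fill_row(geo, 1, c(0, 1, 1, 2, 3))
geo <- fill_row(geo, 2, c(0, 2, 2, 5, 9))  # 5 and 9 have no column of their own
geo
```

In the second row, the distances 5 and 9 both land in the column named 3, and the column named 1 stays at 0 because that run produced no distance of 1.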
