This analysis follows the general split-apply-combine approach, where the data are split by week, graph functions are applied, and then the results are combined together. There are several tools for this, but below uses base R and data.table.
Base R
First, set the date class for your data so that the term "last two weeks" has meaning.
# Set date class and order
d$week <- as.Date(d$week, format="%m/%d/%Y")
d <- d[order(d$week), ]
d <- d[d$timestalked > 0, ] # remove zero-weight edges (not needed if you use weights)
Then split the data and apply the graph functions.
library(igraph)

# split data and form a graph for each week
g1 <- lapply(split(seq(nrow(d)), d$week), function(i)
  graph_from_data_frame(d[i, ]))
# you can then run graph functions to extract specific measures
(grps <- sapply(g1, function(x) eigen_centrality(x,
                      weights = E(x)$timestalked)$vector))
# 2010-01-01 2010-01-08 2010-01-15
# A 0.5547002 0.9284767 1.0000000
# B 0.8320503 0.3713907 0.7071068
# C 1.0000000 1.0000000 0.7071068
# Aside: If you only have one function to run on the graphs,
# you could do this in one step
#
# sapply(split(seq(nrow(d)), d$week), function(i) {
#   x = graph_from_data_frame(d[i, ])
#   eigen_centrality(x, weights = E(x)$timestalked)$vector
# })
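The same split-apply pattern works for other igraph measures too; as a sketch (reusing the g1 list from above), here is the weighted degree (strength) of each vertex per week:
# weighted degree (strength) per vertex, for each week's graph
sapply(g1, function(x) strength(x, weights = E(x)$timestalked))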
You then need to combine in the analysis on all the data and on the last two weeks - as you only have to build two further graphs, this is not the time-consuming part.
fun1 <- function(i, name) {
  x = graph_from_data_frame(i)
  d = data.frame(eigen_centrality(x, weights = E(x)$timestalked)$vector)
  setNames(d, name)
}
a = fun1(d, "alldata")
lt = fun1(d[d$week %in% tail(unique(d$week), 2), ], "lasttwo")
# Combine: could use `cbind` in this example, but perhaps `merge` is
# safer if there are different levels between dates
data.frame(grps, lt, a) # or
Reduce(merge, lapply(list(grps, a, lt),
                     function(x) data.frame(x, nms = row.names(x))))
# nms X2010.01.01 X2010.01.08 X2010.01.15 alldata lasttwo
# 1 A 0.5547002 0.9284767 1.0000000 0.909899 1.0
# 2 B 0.8320503 0.3713907 0.7071068 0.607475 0.5
# 3 C 1.0000000 1.0000000 0.7071068 1.000000 1.0
data.table
It is likely that the time-consuming step will be explicitly splitting the data and applying the graph function to each group. data.table should offer some benefit here, especially as the data becomes larger and/or the number of groups grows.
# function to apply to graph
fun <- function(d) {
  x = graph_from_data_frame(d)
  e = eigen_centrality(x, weights = E(x)$timestalked)$vector
  list(e, names(e))
}
library(data.table)
dcast(
  setDT(d)[, fun(.SD), by = week],  # apply function - returns data in long format
  V2 ~ week, value.var = "V1")      # convert to wide format
# V2 2010-01-01 2010-01-08 2010-01-15
# 1: A 0.5547002 0.9284767 1.0000000
# 2: B 0.8320503 0.3713907 0.7071068
# 3: C 1.0000000 1.0000000 0.7071068
Then just run the function over the full data / last two weeks as before.
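For completeness, a minimal sketch of that step, reusing fun and assuming d is the data.table prepared above (with week already converted to Date):
all_dt <- fun(d)                                   # all data
last2  <- fun(d[week %in% tail(unique(week), 2)])  # last two weeks only
data.frame(nms = all_dt[[2]], alldata = all_dt[[1]],
           lasttwo = last2[[1]][match(all_dt[[2]], last2[[2]])])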
There are differences between the answers, which comes down to how the weights argument is used when calculating the centralities - this answer uses the weights, whereas the others don't.
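A quick illustration of why this matters (a sketch on the prepared d from above): eigen_centrality() only picks up an edge attribute automatically when it is named weight, so without the weights argument the timestalked values are ignored.
g <- graph_from_data_frame(d[d$week == as.Date("2010-01-01"), ])
eigen_centrality(g)$vector                              # unweighted - ignores timestalked
eigen_centrality(g, weights = E(g)$timestalked)$vector  # weighted by timestalked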
Data
d = structure(list(from = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A",
"B", "C"), class = "factor"), to = structure(c(2L, 3L, 2L, 3L,
2L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("A",
"B", "C"), class = "factor"), timestalked = c(0L, 1L, 0L, 4L,
1L, 2L, 0L, 1L, 0L, 2L, 1L, 0L, 1L, 2L, 1L, 0L, 0L, 0L), week = structure(c(1L,
1L, 3L, 3L, 2L, 2L, 1L, 1L, 3L, 3L, 2L, 2L, 1L, 1L, 3L, 3L, 2L,
2L), .Label = c("1/1/2010", "1/15/2010", "1/8/2010"), class = "factor")), .Names = c("from",
"to", "timestalked", "week"), class = "data.frame", row.names = c(NA,
-18L))