I have a combined packet capture from both sides of a network connection in Wireshark. The capture is exported as a CSV file, and every row contains, among other things, an ID and a timestamp. Because I capture from both sides, every ID appears in two rows: one with the send timestamp and one with the receive timestamp. I want to calculate the delay by subtracting the send timestamp from the receive timestamp. I have managed to do it, but it takes roughly 12 seconds to go through my list of 17000 packets, and I have 15 lists in total, which adds up to about 3 minutes of execution time with the following code:
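data <- read.csv("normal-novpn.csv", sep=",", numerals="no.loss", header=TRUE)
# Column 7 holds the ID, e.g. "0xa5f0 (42480)"; column 2 holds the timestamp in seconds
ID = data.matrix(data[,7], rownames.force = NA)
time = data.matrix(as.double(as.character(data[,2])), rownames.force = NA)
time = time*1000000 # Time is now in microseconds
len <- nrow(ID)
# mat[,1] = numeric ID, mat[,2] = timestamp in microseconds
mat <- matrix(,nrow=len,ncol=2)
for(i in 1:len){
  # Split "0xa5f0 (42480)" on the space and strip the parentheses from the second part
  d <- unlist(strsplit(ID[i], " "))
  mat[i,1] <- as.numeric(gsub('[()]','',d[2]))
  mat[i,2] <- time[i]
}
delay = vector(length=len/2)
k <- 1
# For every row i, scan from row i onward for a row with the same ID and a later timestamp
for(i in 1:len){
  for(j in i:len){
    if(mat[i,1] == mat[j,1] && mat[j,2] > mat[i,2]){
      delay[k] <- mat[j,2] - mat[i,2]
      k <- k+1
    }
  }
}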
The rows in the CSV file are ordered by time, and a row looks like this:
"32","1505997726.015245358","10.0.10.70","10.0.10.1","UDP","214","0xa5f0 (42480)","50414 > 5201 Len=172"
where the timestamp is "1505997726.015245358" and the ID is "0xa5f0 (42480)".
My question is whether I can do this more efficiently to reduce the execution time.
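For reference, the number in parentheses can probably be extracted from all ID strings in one vectorized call rather than row by row. This is only a minimal sketch: `ident` and `id` are hypothetical names, and the two sample strings are taken from the rows shown in this post.

```r
# Sample Identification strings in the format shown above
ident <- c("0xa5f0 (42480)", "0x8dab (36267)")

# Keep only the digits between the parentheses, for all elements at once
id <- as.numeric(sub(".*\\(([0-9]+)\\).*", "\\1", ident))
```

Because `sub` and `as.numeric` operate on whole vectors, this would replace the first `for` loop without any explicit iteration.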
Update: This is a link to one of my CSV files containing the 17000 lines: https://justpaste.it/1bjoy
Here is a smaller file with only 10 lines of data plus the header. Note that in some files the two rows with the same ID are not next to each other in the list.
"No.","Time","Source","Destination","Protocol","Length","Identification","Info"
"120","1505984967.366049706","10.0.0.50","10.0.0.35","UDP","214","0x8dab (36267)","46670 > 5201 Len=172"
"123","1505984967.366440","10.0.0.50","10.0.0.35","UDP","214","0x8dab (36267)","46670 > 5201 Len=172"
"124","1505984967.386478504","10.0.0.50","10.0.0.35","UDP","214","0x8dac (36268)","46670 > 5201 Len=172"
"125","1505984967.386606","10.0.0.50","10.0.0.35","UDP","214","0x8dac (36268)","46670 > 5201 Len=172"
"130","1505984967.406353133","10.0.0.50","10.0.0.35","UDP","214","0x8db0 (36272)","46670 > 5201 Len=172"
"131","1505984967.406555","10.0.0.50","10.0.0.35","UDP","214","0x8db0 (36272)","46670 > 5201 Len=172"
"132","1505984967.426372842","10.0.0.50","10.0.0.35","UDP","214","0x8db1 (36273)","46670 > 5201 Len=172"
"133","1505984967.426558","10.0.0.50","10.0.0.35","UDP","214","0x8db1 (36273)","46670 > 5201 Len=172"
"134","1505984967.446282356","10.0.0.50","10.0.0.35","UDP","214","0x8db6 (36278)","46670 > 5201 Len=172"
"135","1505984967.446555","10.0.0.50","10.0.0.35","UDP","214","0x8db6 (36278)","46670 > 5201 Len=172"
Update 2: The order of the rows must be preserved, because I will perform additional calculations on the new values. The first column, "No.", is the packet number as counted by Wireshark and must keep increasing when traversing down the list.
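One direction that might avoid the nested loop is to look up the first occurrence of each ID with `match`, which does not require the two rows to be adjacent and leaves the row order untouched. The sketch below is not a tested solution for the full files. It assumes every ID occurs exactly twice within a file and that the first occurrence is the send row. The names `csv_text`, `time_us`, `first`, `is_recv` and `delay` are made up for illustration, and the data is the first four rows of the sample above.

```r
# First four data rows of the sample file, inlined so the sketch is self-contained
csv_text <- '"No.","Time","Source","Destination","Protocol","Length","Identification","Info"
"120","1505984967.366049706","10.0.0.50","10.0.0.35","UDP","214","0x8dab (36267)","46670 > 5201 Len=172"
"123","1505984967.366440","10.0.0.50","10.0.0.35","UDP","214","0x8dab (36267)","46670 > 5201 Len=172"
"124","1505984967.386478504","10.0.0.50","10.0.0.35","UDP","214","0x8dac (36268)","46670 > 5201 Len=172"
"125","1505984967.386606","10.0.0.50","10.0.0.35","UDP","214","0x8dac (36268)","46670 > 5201 Len=172"'

# Read every column as character so the timestamps are not converted implicitly
data <- read.csv(text = csv_text, colClasses = "character")

# Timestamps in microseconds (a double cannot hold all nanosecond digits)
time_us <- as.numeric(data[[2]]) * 1e6

# Numeric ID taken from between the parentheses, vectorized
id <- as.numeric(sub(".*\\(([0-9]+)\\).*", "\\1", data[[7]]))

# For each row, the index of the first row carrying the same ID
first <- match(id, id)

# Rows whose first occurrence is an earlier row are assumed to be receive rows
is_recv <- first != seq_along(id)

# Delay aligned with the original rows: NA on send rows, microseconds on receive rows
delay <- rep(NA_real_, length(id))
delay[is_recv] <- time_us[is_recv] - time_us[first[is_recv]]
```

Keeping `delay` the same length as the data, with `NA` on the send rows, means it can be attached as a new column without reordering anything. If an ID could repeat more than twice in a longer capture, this pairing would need to be adjusted.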