
I'm new to R and trying to solve a problem that is challenging for me.

I have a .csv file containing 22,388 rows of comma-separated integers. For each row separately, I want to find all possible pairs of the integers in that row and list them pair by pair, so that I can make a visual representation of them as clusters.

I've tried installing the combinat package for R but I can't seem to solve the problem.

An example of two rows from my file would be:

2 13

2 8 6

These should be listed as pairs, row by row, like this:

2, 13
2, 8
2, 6
8, 6

Cœur

2 Answers


`combn` gives the combinations of a vector's elements. Paste each combination together with `apply`:

x <- c(2, 13)
y <- c(2, 8, 6)
apply(combn(x, 2), 2, paste, collapse=' ')
## [1] "2 13"

Loop over these:

unlist(sapply(list(x, y), function(x) apply(combn(x, 2), 2, paste, collapse=' ')))
## [1] "2 13" "2 8"  "2 6"  "8 6" 
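
If a two-column table is more convenient than pasted strings, the same `combn` call can be transposed and the per-row results stacked with `rbind`. This is only a minimal sketch, assuming every vector has at least two elements; the names `rows` and `pair_table` are made up for illustration.

```r
x <- c(2, 13)
y <- c(2, 8, 6)
rows <- list(x, y)  # hypothetical container for the per-row vectors

# combn() returns a 2 x k matrix with one pair per column;
# t() flips it so that each pair becomes a row.
pair_table <- do.call(rbind, lapply(rows, function(v) t(combn(v, 2))))
pair_table
```

Using `lapply` here keeps the result a list regardless of how many pairs each row produces, which `sapply` does not guarantee.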
Matthew Lundberg

Sample input (replace `textConnection(...)` with the name of your CSV file):

csv <- textConnection("2,13
2,8,6")

This reads the input into a list of values:

input.lines  <- readLines(csv)
input.values <- strsplit(input.lines, ',')
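
For a file on disk, the same two steps might look like the sketch below. The temporary file only stands in for the real CSV, the names `csv.file`, `file.lines`, and `file.values` are hypothetical, and tolerating spaces around the commas is an assumption about how the file is formatted.

```r
# Hypothetical stand-in for the real file: write the two sample rows to a temp file.
csv.file <- tempfile(fileext = ".csv")
writeLines(c("2,13", "2,8,6"), csv.file)

# Read the file line by line, since rows have different lengths.
file.lines <- readLines(csv.file)

# Split each line on commas, ignoring optional whitespace around them.
# The values are still character strings at this point.
file.values <- strsplit(trimws(file.lines), "\\s*,\\s*")
file.values
```

Reading with `readLines` rather than `read.csv` avoids padding the shorter rows with `NA`.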

This creates a nested list of pairs:

pairs <- lapply(input.values, combn, 2, simplify = FALSE)
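
Note that `combn(x, 2)` stops with an error when a row holds fewer than two values, so a file with single-value or empty lines would break the `lapply` call. One possible guard, sketched here with a made-up third row, is to drop such rows first. The name `has.pair` is illustrative only.

```r
# Sample rows as strsplit() would return them; the third row is a
# hypothetical single-value line added to show the guard.
input.values <- list(c("2", "13"), c("2", "8", "6"), "7")

# Keep only rows that can form at least one pair.
has.pair <- lengths(input.values) >= 2
pairs <- lapply(input.values[has.pair], combn, 2, simplify = FALSE)
length(pairs)
```

The guard matters even more if the values are converted to numbers before calling `combn`: a single positive integer `n` is treated as `seq_len(n)`, which would silently produce pairs that are not in the file.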

This puts everything in a nice matrix of integers:

pairs.mat <- matrix(as.integer(unlist(pairs)), ncol = 2, byrow = TRUE)
pairs.mat
#      [,1] [,2]
# [1,]    2   13
# [2,]    2    8
# [3,]    2    6
# [4,]    8    6
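
If each pair needs to stay associated with the line it came from, one option is to carry the line number along as an extra column. This is a sketch on the same sample input, assuming every line has at least two values; the names `per.line` and `pairs.by.line` and the column names `line`, `a`, and `b` are illustrative only.

```r
input.values <- list(c("2", "13"), c("2", "8", "6"))

# One small matrix per input line: the line number plus both members of each pair.
# Assumes every line has at least two values (see combn's handling of short input).
per.line <- lapply(seq_along(input.values), function(i) {
  p <- t(combn(as.integer(input.values[[i]]), 2))
  cbind(line = i, a = p[, 1], b = p[, 2])
})

# Stack the per-line matrices into one table.
pairs.by.line <- do.call(rbind, per.line)
pairs.by.line
```

The `line` column makes it possible to split or group the pairs by source row later, for example when drawing one cluster per row.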
flodel
  • Thank you so much, I can see where it's going. The only problem now is that my file contains 22,388 rows, and to be specific, I have to find all possible pairs for each line separately, not the combinations of pairs across the entire file. I've edited my question to reflect that more clearly. Can you help? I guess I need a loop or something. Thanks! – Matias Bruhn May 09 '13 at 07:45
  • @Matias: How is that different from what you already asked for? As an example, see that `13` (line 1 in the file) is not paired with `8` or `6` (line 2 in the file). – flodel May 09 '13 at 22:45
  • If what you need is a list of two-column matrices, then try: `lapply(pairs, function(x) matrix(as.integer(unlist(x)), ncol = 2, byrow = TRUE))` – flodel May 09 '13 at 22:48