0

I have a dataset with three columns: column 1 is an ID (not unique), and columns 2 and 3 hold a positive and a negative value, respectively, associated with that ID. I am new to R and am trying to figure out how to count the number of pairs of values associated with each ID. The table() and unique() functions are not helping, since I need to count the pairs rather than individual values. Thanks!

DataScience

3 Answers

1

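With the data.table package (assuming tdata is a data.table), this counts how many times each (ID, COLUMN2, COLUMN3) combination occurs: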

library(data.table)
tdata[, list(paircount = .N) , by = c("ID","COLUMN2","COLUMN3")]

EDIT:

Based on Michael's feedback, I may have misunderstood the question.

tdata[, list(paircount = nrow(unique(.SD))), by = "ID"]

should get you what you need.
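To make the difference between the two calls concrete, here is a minimal sketch on made-up data. The toy values are an assumption for illustration only; the column names follow the ones used in this answer.

```r
library(data.table)

# Hypothetical toy data: ID 1 has two distinct pairs (one repeated), ID 2 has one.
tdata <- data.table(
  ID      = c(1, 1, 1, 2, 2),
  COLUMN2 = c(5, 5, 7, 3, 3),
  COLUMN3 = c(-2, -2, -4, -1, -1)
)

# First version: one row per (ID, COLUMN2, COLUMN3) combination,
# with the number of times that combination occurs.
per_pair <- tdata[, list(paircount = .N), by = c("ID", "COLUMN2", "COLUMN3")]

# Second version: one row per ID, with the number of distinct pairs.
per_id <- tdata[, list(paircount = nrow(unique(.SD))), by = "ID"]

print(per_pair)
print(per_id)
```

On this toy input the first call returns three rows (one per observed combination) and the second returns one row per ID.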

Chris
  • I am pretty sure that if DataScience wanted to do this, he or she would just have used `table()`... – Michael Lawrence Apr 08 '15 at 23:19
  • What's the difference between your method and Arun's method at http://stackoverflow.com/questions/26244685/count-every-possible-pair-of-values-in-a-column-grouped-by-multiple-columns, which uses setorder and setkey? – skan Dec 29 '16 at 01:09
1

I assume you want to count the number of unique pairs for each ID. As @BondedDust mentioned, use interaction:

df$pair <- with(df, interaction(COLUMN2, COLUMN3))
rowSums(xtabs(~ ID + pair, df) > 0)
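As a rough illustration of how this works, the sketch below runs the same idea on made-up data, assuming the ID column is named ID as in the other answers. `interaction()` turns each (COLUMN2, COLUMN3) combination into one factor level, `xtabs()` cross-tabulates ID against those levels, and `rowSums(... > 0)` counts how many levels occur at least once for each ID. The `drop = TRUE` argument is an addition here; it only discards combinations that never occur.

```r
# Hypothetical toy data: ID 1 has two distinct pairs (one repeated), ID 2 has one.
df <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  COLUMN2 = c(5, 5, 7, 3, 3),
  COLUMN3 = c(-2, -2, -4, -1, -1)
)

# One factor level per observed (COLUMN2, COLUMN3) combination.
df$pair <- with(df, interaction(COLUMN2, COLUMN3, drop = TRUE))

# Cross-tabulate ID by pair, then count the pairs present for each ID.
pairs_per_id <- rowSums(xtabs(~ ID + pair, df) > 0)
print(pairs_per_id)
```

The result is a vector named by ID, so a single ID's count can be looked up with, for example, `pairs_per_id["1"]`.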
Michael Lawrence
0

Maybe try

unique(data[,c("ID", "COLUMN2", "COLUMN3")])

Or, to have the results grouped by ID:

by(data = data[,c("COLUMN2", "COLUMN3")],INDICES = data$ID, FUN = unique)
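Both calls above list the unique pairs rather than counting them. One possible way to get a count per ID from the first call is to tabulate the ID column of the de-duplicated rows. This is a sketch on made-up data, and the name `dat` is a stand-in for the data frame called `data` above.

```r
# Hypothetical toy data: ID 1 has two distinct pairs (one repeated), ID 2 has one.
dat <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  COLUMN2 = c(5, 5, 7, 3, 3),
  COLUMN3 = c(-2, -2, -4, -1, -1)
)

# Keep one row per distinct (ID, COLUMN2, COLUMN3) combination.
uniq <- unique(dat[, c("ID", "COLUMN2", "COLUMN3")])

# Each remaining row is one distinct pair, so tabulating ID counts pairs per ID.
counts <- table(uniq$ID)
print(counts)
```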
Dominic Comtois