
I'm looking for easy-to-use algorithms in R that either label rows (outlier or not) or score them (say, 7.5) for outlyingness. That is, I have a matrix `m` with several rows, and I want to identify the rows that are outliers compared to the other rows.

m <- matrix( data = c(1,1,1,0,0,0,1,0,1), ncol = 3 )

To clarify, I want to compare each complete row of the matrix with all the other rows to spot outliers.

JimBoy
  • 3
    This is very broad. Can you narrow it down? Give an example? – Heroka Sep 30 '15 at 16:07
  • 5
    Read [this](http://stackoverflow.com/help/mcve) and [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) and then update your question to conform to these guidelines. – jlhoward Sep 30 '15 at 16:09

1 Answer

Here's some really simple outlier detection (using either the boxplot statistics or quantiles of the data) that I wrote a few years ago.

Outliers

But, as noted, it would be helpful if you'd describe your problem with greater precision.

Edit:

Also, you say you want row-wise outliers. Do you mean that you're interested in identifying whole rows as outliers, rather than individual observations within a variable (as is typically done)? If so, you'll want to use some sort of distance metric, though which metric you choose will depend on your data.
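
As a rough sketch of the distance-based idea, one option is to score each row by its mean distance to all the other rows. This is only an illustration under assumptions: the Euclidean metric and the mean-distance score (the name `row_score` is hypothetical) are choices made here, not something the question prescribes, and a different metric may suit your data better.

```r
# Example matrix from the question: rows are observations, columns are variables.
m <- matrix(data = c(1, 1, 1, 0, 0, 0, 1, 0, 1), ncol = 3)

# Pairwise distances between rows, expanded to a full n x n matrix.
# Assumption: Euclidean distance; swap `method` for another metric if needed.
d <- as.matrix(dist(m, method = "euclidean"))

# Hypothetical score: mean distance from each row to every other row.
# The diagonal is zero, so dividing by (n - 1) leaves out the self-distance.
row_score <- rowSums(d) / (nrow(d) - 1)

# Rows ordered from most to least distant from the rest.
ranking <- order(row_score, decreasing = TRUE)
```

A higher `row_score` means the row sits further from the rest. Turning the score into an outlier/not-outlier label still needs a cutoff, and that choice depends on your data.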

alexwhitworth
  • Thank you, Alex! I have added some information to my original question. I'm not entirely sure I follow your reading of row-wise outliers. As stated above, I want to compare all the information in a given row (meaning, all of its cells) to all the other rows in the matrix. – JimBoy Sep 30 '15 at 18:56
  • So, (a) your matrix `m` is very unhelpful; (b) your definition of "row outliers" is not an outlier in the traditional sense. You're looking for observations that lie at large distances from the others in p-dimensional space, e.g. `dist(m)` – alexwhitworth Sep 30 '15 at 20:19
  • Thank you, Alex! This is exactly what I was looking for! But how do I read the dist-table? BTW: The matrix I provided is very close to what I actually have. – JimBoy Sep 30 '15 at 20:29
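
On reading the `dist` table: `dist()` prints only the lower triangle, and the entry at row label i, column label j is the distance between rows i and j of the matrix. Converting it with `as.matrix()` gives a full symmetric matrix that is easier to index. A small sketch using the matrix from the question (the name `d_full` is just illustrative, and the default Euclidean metric is assumed):

```r
# Matrix from the question; its rows are (1,0,1), (1,0,0) and (1,0,1).
m <- matrix(data = c(1, 1, 1, 0, 0, 0, 1, 0, 1), ncol = 3)

# Lower-triangle distance object, as printed by dist().
d <- dist(m)

# Full symmetric matrix with a zero diagonal.
d_full <- as.matrix(d)

d_full[2, 1]  # distance between rows 2 and 1: 1 (they differ in one cell)
d_full[3, 1]  # distance between rows 3 and 1: 0 (the rows are identical)
```

A row whose entries in `d_full` are large across the board is far from every other row, which is the sense of "row outlier" discussed above.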