I am new to R and have 1024 rows of data with 3 numeric columns. I have created a 3D plot and need to identify the row number of an outlier that sticks out like a sore thumb in plot3D but is not visible in any other graph.
-
Please post a sample of your data, code you've tried so far, and what your parameters are for being an outlier. – Rich Scriven Mar 31 '14 at 04:13
-
Here's some of my data – user3479729 Mar 31 '14 at 04:26
-
-1.5454 -0.6855 0.1003 -0.5284 -0.4065 -0.2645 -1.0868 -0.5329 0.1623 -1e-04 -0.9569 -2.0055 0.389 -0.8356 -2.2085 0.5326 0.0391 -0.5044 -1.8376 -0.7834 0.3436 – user3479729 Mar 31 '14 at 04:27
3 Answers
1
Hopefully this helps get the job done for you.
data <- c(-1.5454, -0.6855, 0.1003, -0.5284, -0.4065, -0.2645,
          -1.0868, -0.5329, 0.1623, -1e-04, -0.9569, -2.0055,
          0.389, -0.8356, -2.2085, 0.5326, 0.0391, -0.5044,
          -1.8376, -0.7834, 0.3436)
## original data
dd <- data.frame(matrix(data, ncol = 3, byrow = TRUE))
## find the row number of the largest row maximum
which.max(apply(dd, 1, max))
## [1] 6
## Use the previous line to remove the unwanted row
newDd <- dd[ -which.max(apply(dd, 1, max)), ]
## plot the two data frames together to see the difference
library(plot3D)
par(mfrow = c(1, 2))
with(dd, scatter3D(X1, X2, X3, phi = 0, theta = 50, bty = "g",
                   col = gg.col(100), pch = 19, cex = 2, colkey = FALSE))
with(newDd, scatter3D(X1, X2, X3, phi = 0, theta = 50, bty = "g",
                      col = gg.col(100), pch = 19, cex = 2, colkey = TRUE))

Rich Scriven
0
Use the built-in arrayInd to find the position of the maximum value (or of the minimum, with which.min):
arrayInd(which.max(as.matrix(df)), .dim = dim(df))
For example, we are going to make a 3-column data frame with one sore thumb.
df <- data.frame(structure(replicate(3, runif(1024, 0, 1), simplify = FALSE), .Names = c('one', 'two', 'three')))
df[50, 2] <- 10
Now we get
arrayInd(which.max(as.matrix(df)), .dim = dim(df))
#      [,1] [,2]
# [1,]   50    2
And we see the offender is in row 50 and column 2.

Robert Krzyzanowski
-
Thanks for your responses, but after taking out the min and the max, the outlier that only seems to appear in plot3D is still there. I've been guided to create an artificially constructed factor, which I'm guessing is meant for grouping somehow, but I don't know what to group on. I tried putting the row number in a column and color coding by it, but all I get is a light green outlier, and I still can't be sure which row it came from. – user3479729 Mar 31 '14 at 06:14
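A point that survives removing the min and max is probably unusual only in how its three values combine, not in any single column. One common way to look for such a point (not something proposed in the answers here) is the Mahalanobis distance of each row from the column means. The sketch below uses simulated data as a stand-in for the real 1024 rows; the names dat, d2 and suspect, and the planted row 500, are made up for illustration.

```r
# Simulated stand-in for the real data (assumption): three strongly
# correlated numeric columns, 1024 rows.
set.seed(1)
n <- 1024
z <- rnorm(n)
dat <- data.frame(x = z + rnorm(n, sd = 0.1),
                  y = z + rnorm(n, sd = 0.1),
                  w = z + rnorm(n, sd = 0.1))

# Plant a point whose individual values are all ordinary,
# but whose combination breaks the pattern of the other rows.
dat[500, ] <- c(1, -1, 1)

# Squared Mahalanobis distance of every row from the column means.
m <- as.matrix(dat)
d2 <- mahalanobis(m, center = colMeans(m), cov = cov(m))

# Row number with the largest distance, plus the next few candidates.
suspect <- which.max(d2)
print(suspect)
print(head(order(d2, decreasing = TRUE), 5))
```

In this simulated case the planted row should come out on top even though none of its values is a column minimum or maximum. On real data, the rows with the largest distances are only candidates to check against the 3D plot.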
0
Welcome user3479729. Please post a reproducible example; otherwise you will get either no answers or bad ones.
If 'M' is the matrix that you plot (I am assuming here that you are plotting a matrix) and 'thres' is your threshold for outlier data, you can use:
which(M > thres, arr.ind = TRUE)
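As a small illustration of what that call returns, here is a sketch on a made-up 4 x 3 matrix; the values and the cutoff of 5 are hypothetical, not taken from the question.

```r
# Hypothetical matrix, filled column by column, with one large value
# planted at row 3, column 2.
M <- matrix(c(0.2, 0.5, 0.1, 0.4,
              0.3, 0.6, 9.0, 0.2,
              0.7, 0.1, 0.3, 0.5), nrow = 4)
thres <- 5  # hypothetical cutoff

# One row of output per cell above the cutoff, with columns "row" and "col".
hits <- which(M > thres, arr.ind = TRUE)
print(hits)

# Just the distinct row numbers that contain an offending value.
print(unique(hits[, "row"]))
```

If the data are in a data frame rather than a matrix, wrapping it in as.matrix() first should give the same kind of result.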

RockScience