I took a random sample from a data frame, but I don't know how to get the remaining rows.

df <- data.frame(x=rep(1:3,each=2),y=6:1,z=letters[1:6])

#select 3 random rows
df[sample(nrow(df),3)]

What I want is to get the remaining data frame with the other 3 rows.

Elif Y

Possible duplicate of [Random rows in dataframe in R](http://stackoverflow.com/questions/8273313/random-rows-in-dataframe-in-r) – KFB Nov 12 '14 at 07:43

2 Answers

sample draws a new random sample each time you run it, so to reproduce its results you need to either call set.seed first or save the result in a variable.
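
As a minimal illustration of the reproducibility point, resetting the seed before each call makes sample return the same draw. The seed value 42 and the variable names are arbitrary choices for this sketch.

```r
# The seed value 42 is arbitrary; any fixed integer works.
set.seed(42)
first <- sample(6, 3)

set.seed(42)
second <- sample(6, 3)

# Same seed followed by the same call: the two draws match.
identical(first, second)

# Without resetting the seed, the generator state has advanced,
# so this call is a fresh draw rather than a repeat of the first.
third <- sample(6, 3)
```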

To answer your question: put a minus sign (-) before the index to get the rest of the data set. Also, don't forget the comma after indx if you want to select rows; without it, as in your question, R treats the index as a column selection.

set.seed(1)
indx <- sample(nrow(df), 3)

Your subset

df[indx, ] 
#   x y z
# 2 1 5 b
# 6 3 1 f
# 3 2 4 c

Remaining data set

df[-indx, ]
#   x y z
# 1 1 6 a
# 4 2 3 d
# 5 3 2 e
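
The two steps above can be wrapped into one helper. This is only a sketch: split_sample is a hypothetical name, not a base R function. It builds the remainder from a logical mask rather than a negative index, because data[-integer(0), ] returns zero rows when the sample is empty.

```r
# Hypothetical helper (not part of base R): split a data frame into
# a random sample of n rows and the remaining rows.
split_sample <- function(data, n) {
  stopifnot(n >= 0, n <= nrow(data))
  picked <- sample.int(nrow(data), n)
  # Logical mask for the remainder: -picked would select zero rows
  # when n is 0, whereas the mask keeps every row in that case.
  in_sample <- seq_len(nrow(data)) %in% picked
  list(sample = data[picked, , drop = FALSE],
       rest   = data[!in_sample, , drop = FALSE])
}

df <- data.frame(x = rep(1:3, each = 2), y = 6:1, z = letters[1:6])
set.seed(1)
parts <- split_sample(df, 3)

# The two pieces do not overlap and together cover every row.
nrow(parts$sample) + nrow(parts$rest) == nrow(df)
length(intersect(rownames(parts$sample), rownames(parts$rest))) == 0
```
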
David Arenburg

Try:

> df
  x y z
1 1 6 a
2 1 5 b
3 2 4 c
4 2 3 d
5 3 2 e
6 3 1 f
> 
> df2 = df[sample(nrow(df),3),]
> df2
  x y z
5 3 2 e
3 2 4 c
1 1 6 a

> df[!rownames(df) %in% rownames(df2),]
  x y z
2 1 5 b
4 2 3 d
6 3 1 f
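
If the sampled indices are still available, the same remainder can be obtained without going through row names. The sketch below assumes the indices were saved in a variable (idx here is an illustrative name) and compares the two approaches on the example data.

```r
df <- data.frame(x = rep(1:3, each = 2), y = 6:1, z = letters[1:6])

# Keep the sampled indices so they can be reused.
idx <- sample(nrow(df), 3)
df2 <- df[idx, ]

# Row-name approach: drop rows whose names appear in the sample.
rest_by_name <- df[!rownames(df) %in% rownames(df2), ]

# Index approach: keep the row positions that were not sampled.
rest_by_index <- df[setdiff(seq_len(nrow(df)), idx), ]

# Both approaches should select the same rows in the same order.
identical(rownames(rest_by_name), rownames(rest_by_index))
```
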
rnso