How to split one-column data.frame and get data.frame as output?

Question

I used the code from this answer to split my train data into two sets.

trainLabels <- read.csv(trainLabels.file, stringsAsFactors = FALSE, header = FALSE)

> str(trainLabels)
'data.frame':   1000 obs. of  1 variable:
 $ V1: int  1 0 0 1 0 1 0 1 1 0 ...

trainLabelsTrain <- trainLabels[train_ind, ]
trainLabelsTest <- trainLabels[-train_ind, ]

> str(trainLabelsTrain)
 int [1:750] 0 1 0 0 0 0 1 1 1 0 ...

The result is an integer vector, but I would like a data.frame just like the original data (trainLabels).

How can I get a data.frame?

Put drop=FALSE in your subsetting lines. – Thomas, Oct 22 '13 at 14:42

score 3 · Accepted Answer · answered Oct 22 '13 at 14:42

Use the drop = FALSE argument in your subsetting:

# drop = TRUE by default in `[` subsetting...
df <- data.frame( a = 1:10 )
df[ c(1,3,5) , ]
#[1] 1 3 5

#  With drop = FALSE...
df[ c(1,3,5) , , drop = FALSE ]
#  a
#1 1
#3 3
#5 5

When drop = TRUE, R will attempt to coerce the result to the lowest possible dimension. In this case that is an atomic vector, because the data.frame has only a single column.
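Applied to the data in the question, this could look like the sketch below. The question does not show how train_ind was built, so the random 75% sample here is only an assumption for illustration, and the simulated labels stand in for the CSV file.

```r
# Stand-in for the asker's data: a one-column data.frame of 0/1 labels.
set.seed(123)
trainLabels <- data.frame(V1 = rbinom(1000, 1, 0.5))

# Hypothetical train_ind (assumed, not from the question):
# a random sample of 750 row indices.
train_ind <- sample(seq_len(nrow(trainLabels)), size = 750)

# drop = FALSE keeps the data.frame structure even with one column.
trainLabelsTrain <- trainLabels[train_ind, , drop = FALSE]
trainLabelsTest  <- trainLabels[-train_ind, , drop = FALSE]

str(trainLabelsTrain)  # a data.frame with 750 obs. of 1 variable
```

Both pieces keep the column name V1 and the original row names, so they can be matched back to the rows of trainLabels.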

score 0 · Answer 2 · answered Oct 22 '13 at 16:27

Obviously I like @SimonO101's answer, but I just thought I'd add that one could also use the split function here:

df <- data.frame(a = 1:10)
set.seed(1)
x <- rbinom(10,1,.5)
out <- split(df,x)

The result would be a list of two dataframes:

> str(out)
List of 2
 $ 0:'data.frame':      4 obs. of  1 variable:
  ..$ a: int [1:4] 1 2 5 10
 $ 1:'data.frame':      6 obs. of  1 variable:
  ..$ a: int [1:6] 3 4 6 7 8 9

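This works because the data.frame method of split subsets each group with drop = FALSE internally, so every piece stays a data.frame, whereas drop = TRUE is the default in [. (The drop argument of split itself also defaults to FALSE, but it controls whether unused factor levels are dropped, not whether dimensions are.)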
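As a small follow-up sketch, the individual pieces can be pulled out of the list by name. The names zeros and ones are just illustrative.

```r
df <- data.frame(a = 1:10)
set.seed(1)
x <- rbinom(10, 1, 0.5)
out <- split(df, x)

# List elements are named after the values of x, so index by character name.
zeros <- out[["0"]]
ones  <- out[["1"]]

is.data.frame(zeros)                  # TRUE
nrow(zeros) + nrow(ones) == nrow(df)  # TRUE
```

Indexing with out[["0"]] rather than out[[0]] matters here, since the list names are the character strings "0" and "1".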
