I am working with a large data frame that has 'sex' as one of its columns. It looks something like this:

OriginalDF

SEX X1 X2 X3 X4
0   15 91 12 25
1   12 92 32 35
1   14 94 12 45
0   12 91 42 15
0   11 95 12 25

I would like to split it into two separate data frames, based on the binary sex variable:

DF1

SEX X1 X2 X3 X4
0   15 91 12 25
0   12 91 42 15
0   11 95 12 25

DF2

SEX X1 X2 X3 X4
1   12 92 32 35
1   14 94 12 45

How can I accomplish this efficiently?

Thanks in advance!

Sharma Ji
  • You can use `dplyr` too: `df %>% split(.$SEX)`. It returns a list containing both data frames – patL May 07 '18 at 09:31

1 Answer

# Using data frames
DF1 <- OriginalDF[OriginalDF$SEX == 0, ]
DF2 <- OriginalDF[OriginalDF$SEX == 1, ]

# If the data is very large, I recommend data.table
library(data.table)
OriginalDT <- data.table(OriginalDF)
DT1 <- OriginalDT[SEX == 0]
DT2 <- OriginalDT[SEX == 1]
mbh86
  • Perfect! Thanks a lot, I was putting the comma in the wrong place... – Sharma Ji May 07 '18 at 09:24
  • Just added data.table in an edit, in case your data is very large. It is actually better and easier to use :-) – mbh86 May 07 '18 at 09:25
  • This does not generalize. If you want to split by a factor with n levels (rather than just two), then it is not possible to do it manually as you show in your answer – Sotos May 07 '18 at 09:30
  • Yes, I agree it's possible to generalize to n factors, but that's not what was asked. Dichotomous: dividing into 2 branches/factors, right? – mbh86 May 07 '18 at 10:04
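
For the generalization raised in the comments, here is a minimal sketch using base R's `split()`, which returns one data frame per distinct value of the grouping column. The data frame is rebuilt from the example rows in the question; the name `parts` is only illustrative and does not appear in the original thread.

```r
# Rebuild the example data from the question
OriginalDF <- data.frame(
  SEX = c(0, 1, 1, 0, 0),
  X1  = c(15, 12, 14, 12, 11),
  X2  = c(91, 92, 94, 91, 95),
  X3  = c(12, 32, 12, 42, 12),
  X4  = c(25, 35, 45, 15, 25)
)

# split() returns a named list with one data frame per distinct value of SEX;
# the list names are the values converted to character ("0" and "1" here)
parts <- split(OriginalDF, OriginalDF$SEX)

# `parts` is an illustrative name; pull out individual groups by name
DF1 <- parts[["0"]]  # rows where SEX == 0
DF2 <- parts[["1"]]  # rows where SEX == 1
```

The same call works unchanged if the column has more than two levels, since the list simply gets one element per level. Note that each piece keeps the row names of the original data frame.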