I need to shuffle the rows of a data frame, turning this:
A foo
B bar
C baz
into this:
B foo
C bar
A baz
I.e., the first column should be shuffled while keeping the rest intact. I am doing this using sample() from the kimisc library, as suggested here. A minimal working code example is:
> df <- read.table("file1", header=FALSE, skip=1)
> library(kimisc)
> names <- read.table("file2")
> df1 <- transform(sample(df, size=nrow(names)), V1=names)
>df1
V1 V2
5 A 21266
8 C 22109
7 F 17971
1 J 11137
where file1 is:
Name Value
A 28463
B 11137
C 24966
D 24611
E 14980
F 21266
G 23441
H 17971
I 22109
J 31746
and file2 is:
A
C
F
J
I then want to write this data frame to a file. My expected output is:
A 21266
C 22109
F 17971
J 11137
However, loading the kimisc library provides its own sample function which (unlike the vanilla one) shuffles a data frame the way I want, but it seems to screw up the printing:
write.table(df1, "file3", quote=FALSE, sep='\t', col.names=FALSE)
This produces the following output:
5 1:4 21266
8 1:4 22109
7 1:4 17971
1 1:4 11137
If I use the vanilla sample, the resulting data frame is printed as expected, but it is not shuffled in the way I need (i.e., columns are shuffled instead of rows).
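To illustrate the difference described above, here is a minimal base R sketch (no kimisc). The toy data frame and the variable names toy, by_col, by_row and one_col are made up for illustration. base::sample treats a data frame as a list of columns, so it permutes columns; indexing with sample(nrow(...)) is one common way to permute rows instead.

```r
# Toy data, assumed for illustration only (not the contents of file1).
toy <- data.frame(V1 = c("A", "B", "C"),
                  V2 = c(28463L, 11137L, 24966L),
                  stringsAsFactors = FALSE)

set.seed(1)  # only to make the sketch reproducible

# base::sample on a data frame permutes its columns, not its rows.
by_col <- sample(toy)

# Permuting the row indices shuffles whole rows; each name keeps its value.
by_row <- toy[sample(nrow(toy)), ]

# Shuffling a single column leaves the other columns in place.
one_col <- toy
one_col$V1 <- sample(one_col$V1)

print(by_row)
print(one_col)
```

The last variant is the one that matches the A/B/C example at the top: only the first column changes order.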
So, how can I use sample from kimisc, which lets me sample the rows rather than the columns of a data frame, and still have write.table print the result the way it would for a data frame returned by base::sample?
PS. I am using a list of names because I am actually trying to assign random values from a file containing 143558041 lines to a subset (39953) of the names mentioned in that file.
As requested, the output of dput(df1) is:
> dput(df1)
structure(list(V1 = structure(list(V1 = structure(1:4, .Label = c("A",
"C", "F", "J"), class = "factor")), .Names = "V1", class = "data.frame", row.names = c(NA,
-4L)), V2 = c(24611L, 14980L, 22109L, 21266L)), .Names = c("V1",
"V2"), row.names = c(3L, 4L, 8L, 5L), class = "data.frame")
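The dput output shows that the V1 column of df1 is itself a one-column data frame rather than a plain vector, which may be what write.table renders as 1:4. read.table always returns a data frame, even for a single-column file, so V1=names assigns a whole data frame. Below is a hedged base R sketch that assigns the column vector instead. The toy data and the names vals, names_df, picked and out are assumptions for illustration, and row sampling is done with plain indexing rather than kimisc.

```r
# Toy stand-ins for file1 and file2 (assumed values, for illustration only).
vals <- data.frame(V1 = c("A", "B", "C", "D"),
                   V2 = c(28463L, 11137L, 24966L, 24611L),
                   stringsAsFactors = FALSE)
names_df <- data.frame(V1 = c("A", "C"), stringsAsFactors = FALSE)

set.seed(1)  # only to make the sketch reproducible

# Sample as many rows as there are names, using base R indexing.
picked <- vals[sample(nrow(vals), nrow(names_df)), ]

# Assign the column vector names_df$V1, not the data frame names_df,
# so that V1 stays an ordinary column.
picked$V1 <- names_df$V1

# Write without row or column names; a temporary file stands in for file3.
out <- tempfile()
write.table(picked, out, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = FALSE)
cat(readLines(out), sep = "\n")
```

Passing row.names = FALSE also drops the leading 5, 8, 7, 1 index column seen in the file3 output.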