`rbind` unique entries of all columns of a data frame and write it to a csv file

Question

##Initialise empty dataframe
g <-data.frame(x= character(), y= character(),z=numeric())

## Loop through each columns and list out unique values (with the column name)
for(i in 1:ncol(iris))
{
a<-data.frame(colnames(iris)[i],unique(iris[,i]),i)
g<-rbind(g,a)
setNames(g,c('x','y','z'))
}
## write the output to csv file
write.csv(g,"1.csv")

The output CSV file is something like this

Now the Column headers I want are not proper. I want column headers to be 'x','y','z' respectively. Also the first column should not be there.

Also if you have any other efficient way to do this, let me know. Thanks!

You can change the column names like this http://stackoverflow.com/a/6081514/3519000, you can skip the row names like this http://stackoverflow.com/questions/7484325/prevent-row-names-to-be-written-to-file-when-using-write-csv. and as for the efficiency of your loop, I would suggest you again to look around in StackOverflow first. Thanks — lrnzcig, Aug 22 '16 at 16:05

Zheyuan Li · Accepted Answer · 2016-08-22T18:00:48.863

This will do the work:

for(i in 1:ncol(iris))
{
a<-data.frame(colnames(iris)[i],unique(iris[,i]),i)
g<-rbind(g,a)
}
g <- setNames(g,c('x','y','z'))   ## note the `g <-`
write.csv(g, file="1.csv", row.names = FALSE)   ## don't write row names

setNames returns a new data frame with names "x", "y" and "z", rather than updating the input data frame g. You need the explicit assignment <- to do the "replacement". You may hide such <- by using either of the two

names(g) <- c('x','y','z')
colnames(g) <- c('x','y','z')

Alternatively, you can use the col.names argument inside write.table:

for(i in 1:ncol(iris))
{
a<-data.frame(colnames(iris)[i],unique(iris[,i]),i)
g<-rbind(g,a)
}
write.table(g, file="a.csv", col.names=c("x","y","z"), sep =",", row.names=FALSE)

write.csv() does not support col.names, hence we use write.table(..., sep = ","). Trying to use col.names in write.csv will generate a warning.

A more efficient way

I would avoid using rbind inside a loop. I would do:

x <- lapply(iris, function (column) as.character(unique(column)))
g <- cbind.data.frame(stack(x), rep.int(1:ncol(iris), lengths(x)))
write.table(g, file="1.csv", row.names=FALSE, col.names=c("x","y","z"), sep=",")

Read ?lapply and ?stack for more.

I would like to have one more column in data frame (g)/csv file (say labels). It would repeat from 1 to n based on number of unique entries for a column. Example: For sepal.length it would start from 1 to 35...then for sepal.width it would again start from 1 upto 22. — Abhinav S, Aug 24 '16 at 08:25

`rbind` unique entries of all columns of a data frame and write it to a csv file

1 Answers1