plot density of multiple csv files of different size in R

Question

I have multiple csv files, each with a single column. I want to read them and plot their density distribution in a single plot.

Can anyone help me?

Does the column in each csv have the same name, or do the names differ across files? Do all the csvs have the same dimensions? — zimia, Feb 13 '21 at 15:37
row_bind the csv files together, using an id variable to identify the densities/files. Then use the id variable as an aesthetic to identify the density in whatever way you want (eg by linetype, colour or fill). With no sample data or code it’s difficult to help you in more detail. — Limey, Feb 13 '21 at 16:45

G5W · Accepted Answer · 2021-02-13T19:16:20.057

There are answers elsewhere about reading multiple csv files, so I will mainly concentrate on the density plotting part. Since you did not provide any data, I will use the built-in iris data to create some example files. The first step is to make a reproducible example. I am assuming that you already have the data on disk and have a list of the file names.

## Create some test data
FileNames = paste(names(iris[,1:4]), ".csv", sep="")
for(i in 1:4) {
    write.csv(iris[,i], FileNames[i], row.names=FALSE) 
}
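
If the file names are not already in a vector, `list.files()` can collect them. The following is a minimal sketch, assuming all the csv files sit in one directory. `DataDir` and `FoundFiles` are names made up for illustration, and a temporary directory is used only so that the example is self-contained.

```r
## Hypothetical directory holding the csv files; a temp dir keeps this self-contained
DataDir = file.path(tempdir(), "density_csv_demo")
dir.create(DataDir, showWarnings = FALSE)
for(v in names(iris)[1:4]) {
    write.csv(iris[[v]], file.path(DataDir, paste0(v, ".csv")), row.names = FALSE)
}

## Collect every file whose name ends in .csv, with its full path
FoundFiles = list.files(DataDir, pattern = "\\.csv$", full.names = TRUE)
basename(FoundFiles)
```

With real data, `DataDir` would simply be the folder that contains your files, and the resulting vector can be used wherever a vector of file names is needed.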

So, on to the density plots. There is one small sticky point. Each of the different density plots will cover a different range of x and y values. If you want them all in one plot, you will need to leave enough room in your plot to hold them all. The code below first computes that range, then makes the plots.

## Read in all of the data from csv
DataList = list()
for(i in seq_along(FileNames)) {
    DataList[[i]] = read.csv(FileNames[i], header=TRUE)[[1]]
}

## Find the range that we will need to include all plots
XRange = range(DataList[[1]], na.rm=TRUE)
YRange = c(0,0)
for(i in seq_along(DataList)) {
    Rx = range(DataList[[i]], na.rm=TRUE)
    XRange[1] = min(XRange[1], Rx[1])
    XRange[2] = max(XRange[2], Rx[2])
    YRange[2] = max(density(DataList[[i]], na.rm=TRUE)$y, YRange[2])
}
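
An alternative way to get the same two ranges is to compute each density once and take the ranges from the density objects. This is only a sketch: `Densities` is a name introduced here, and the iris columns stand in for the vectors read from the csv files. One difference from the loop above is that the x range comes from the density curves rather than the raw data, so the tails of the curves, which extend a little beyond the data, are not cut off.

```r
## Stand-in for the DataList read from the csv files (assumption: the iris columns)
DataList = unname(as.list(iris[, 1:4]))

## Compute each density once so it can be reused for the ranges and the plot
Densities = lapply(DataList, density, na.rm = TRUE)

## Ranges taken from the density curves themselves
XRange = range(sapply(Densities, function(d) range(d$x)))
YRange = c(0, max(sapply(Densities, function(d) max(d$y))))
```

The stored curves can then be drawn directly, for example with `lines(Densities[[i]], col=i)`, without recomputing them.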

## Now make all of the plots
plot(density(DataList[[1]], na.rm=TRUE), xlim=XRange, ylim=YRange, 
    xlab=NA, ylab=NA, main="Density Plots")
for(i in seq_along(DataList)) {
    lines(density(DataList[[i]], na.rm=TRUE), col=i)
}
legend("topright", legend=FileNames, lty=1, col=seq_along(FileNames), bty='n')
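
The approach Limey suggests in the comments, binding the files into one long data frame with an id column and mapping that id to an aesthetic, might look like the following. This is a sketch that assumes the ggplot2 package is installed. `LongData`, `file` and `value` are names chosen here for illustration, and the iris columns again stand in for the values read from each csv.

```r
## Stand-ins for the per-file vectors and their file names (assumption: iris columns)
DataList = unname(as.list(iris[, 1:4]))
FileNames = paste(names(iris)[1:4], ".csv", sep="")

## One long data frame: a value column plus an id column naming the source file.
## lengths() gives the size of each vector, so files of different sizes are fine.
LongData = data.frame(
    file  = rep(FileNames, times = lengths(DataList)),
    value = unlist(DataList)
)

## Map the id column to colour so each file gets its own density curve
if(requireNamespace("ggplot2", quietly = TRUE)) {
    library(ggplot2)
    p = ggplot(LongData, aes(x = value, colour = file)) +
        geom_density(na.rm = TRUE) +
        labs(x = NULL, y = NULL, title = "Density Plots")
    print(p)
}
```

With this route ggplot2 chooses the axis limits and builds the legend itself, so the range computation is not needed.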
