I am currently working with a large biological dataset with many data points. The head() function in R gives me the following column names:
intensity - Sample - Acession - Study - Dx
intensity is the only numeric column; the others are character.
First, I converted all factor columns to plain values and stored the result in the data frame unfactordata. Next, I want to make a scatterplot of a specific subset of the data, with a geom_smooth line drawn between the groups. I use the following code:
scatplotprot <- function(name) {
  proteinname <- subset(unfactordata, Acession == name)
  p <- ggplot(data = proteinname, aes(x = Dx, y = intensity, color = Study)) +
    geom_point() +
    geom_smooth(method = 'lm', aes(group = Dx))
  return(p)
}
This gives me a scatterplot with all the intensity values for the two groups (Dx), with each point coloured by the Study it originates from. However, it does not show a line between the two groups (Dx). Depending on which Acession I call, I expect to see between 3 and 8 lines, one per Study.
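A possible explanation, sketched under assumptions: with aes(group = Dx), each group contains only a single x value (one Dx level), so geom_smooth(method = 'lm') has nothing to fit a line through. Grouping the smoother by Study instead should give one line per study across the two Dx levels. The function name scatplotprot_by_study, its data argument, and the toy data below are made up for illustration and are not part of my real dataset.

```r
library(ggplot2)

# Toy data with the same columns as unfactordata (all values are invented).
set.seed(1)
toydata <- data.frame(
  intensity = rnorm(24, mean = 10),
  Sample    = paste0("S", 1:24),
  Acession  = "P00001",
  Study     = rep(c("StudyA", "StudyB", "StudyC"), each = 8),
  Dx        = rep(c("Control", "Case"), times = 12),
  stringsAsFactors = FALSE
)

# Hypothetical variant of scatplotprot that takes the data frame as an argument.
scatplotprot_by_study <- function(data, name) {
  proteinname <- subset(data, Acession == name)
  ggplot(proteinname, aes(x = Dx, y = intensity, color = Study)) +
    geom_point() +
    # group = Study: each group now spans both Dx levels, so lm can fit a line
    geom_smooth(method = "lm", formula = y ~ x, aes(group = Study))
}

p <- scatplotprot_by_study(toydata, "P00001")
print(p)
```

With three studies in the toy data this should draw three lines; on the real data the number of lines would follow the number of studies present for the chosen Acession.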
I hope someone can help me clear up this hopefully small problem.
Warmest,
Patrick