1

I think I missed something in the use of the loess function and I can't understand what i did wrong. I have a data frame in which I store the output (count) of 3 different softwares for 26 different genes on the genomes of different patients. The 3 softwares were each used on the same genome but with different rate of downsampling.

I pooled the results of all the patients by genes. At the end I have a data frame with 4 columns: samplexxx (downsampling rate), software (name of the software I used), gene (the name of the gene) and count (count results given by the software).

My goal is to estimate the downsampling effect (samplexxx) on the count given by the software, and I want to do some regression to be able to compare them with each other.

rate <- c(5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 
          95, 100)

my attempts:

datalist <- list()   

for (i in 1:22) {
  name <- genes[i]
  print(name)
  mod <- paste("mod_", name)
  xfit <- paste("xfit_", name)
  df <- paste("df_", name)
  mod <- loess(data2[data2$gene == name,]$count ~ 
                 data2[data2$gene == name,]$samplexxx)
  xfit <- predict(mod, newdata=data2[data2$gene == name,]$samplexxx)
  df <- setNames(data.frame(matrix(ncol=4, nrow=60)), 
                 c("down", "software", "gene", "loess"))
  df$down <- data2[data2$gene == name,]$samplexxx
  df$software <- data2[data2$gene == name,]$software 
  df$gene <- data2[data2$gene == name,]$gene    
  df$loess <- xfit
  print(xfit)                        
  datalist[[i]] <- df                      
}

data_loess <- do.call(rbind, datalist)

ggplot(data_loess, aes(x=gene, y=loess, fill=software)) + 
  geom_boxplot()  

and:

mod <- loess(data2$count ~ data$samplexxx)
xfit <- predict(mod, newdata=data2$samplexxx)

                 
for (i in 1:20) {
  down <- rate[i]
  print(name)
  title <- paste("loess_downsampling", down)
  out <- paste("loess_downsampling", down, ".pdf", sep="")
  pdf(out, width=10)
  print(ggplot(data2, aes(x=down, y=loess, fill=software))) + 
    geom_boxplot() + ggtitle(title))
dev.off()
} 

Sample data:

> dput(data2)
structure(list(samplexxx = c(5L, 10L, 15L, 20L, 25L, 30L, 35L, 
40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 
5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 
70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 
35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 
100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 
65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 
30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 
95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 
60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 
25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 
90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 
55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 
20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 
85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 
50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 
15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 
80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 
45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 
5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 
70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 
35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 
100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 
65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 
30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 
95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 
60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 
25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 
90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 
55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 
20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 
85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 
50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 
15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 
80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 
45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 
5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 
70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 
35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 
100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 
65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 
30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 
95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 
60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 
25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 
90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 
55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 
20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 
85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 
50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 
15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 
80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 
45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 
5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 
70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 
35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 
100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 
65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 
30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 
95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 
60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 
25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 
90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 
55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 
20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 
85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 
50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 
15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 
80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 
45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 
5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 
70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 
35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 
100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 
65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 
30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 
95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 
60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 
25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 
90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 
55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 
20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 
85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 
50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 
15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 
80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 
45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 
5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 
70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 
35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 
100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 
65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 
30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 
95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 
60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 
25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 
90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 
55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 15L, 
20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 
85L, 90L, 95L, 100L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 
50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 5L, 10L, 
15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L, 55L, 60L, 65L, 70L, 75L, 
80L, 85L, 90L, 95L, 100L), software = structure(c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L), .Label = c("EH", "GangSTR", "Tred"), class = "factor"), 
    gene = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L, 4L, 
    4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
    7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 
    8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
    9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
    10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
    11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 
    12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
    12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
    13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
    14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
    14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 15L, 15L, 15L, 15L, 
    15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 
    15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 
    16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 
    17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 
    17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 
    18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 
    18L, 18L, 18L, 18L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 
    19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 
    20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 
    20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 21L, 21L, 21L, 21L, 
    21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 
    21L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 
    22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
    5L, 5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
    4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 6L, 6L, 6L, 6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
    7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
    8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
    10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
    10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
    12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
    13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
    13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 
    14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
    14L, 14L, 14L, 14L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 
    15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 
    16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 
    16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L, 17L, 17L, 17L, 
    17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 
    17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 
    18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 
    19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 
    19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 20L, 20L, 20L, 20L, 
    20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 
    20L, 20L, 20L, 20L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 
    21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 
    22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 
    22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
    4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
    4L, 4L, 4L, 4L, 4L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 
    7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
    8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
    8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 
    10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
    10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
    12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 
    13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
    13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
    14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
    15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 
    15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 
    16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 
    16L, 16L, 16L, 16L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 
    17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 
    18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 
    18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 19L, 19L, 
    19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 
    19L, 19L, 19L, 19L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 
    20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 
    21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 
    21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 
    22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 
    22L, 22L, 22L, 22L), .Label = c("AFF2", "AR", "ATN1", "ATXN1", 
    "ATXN10", "ATXN2", "ATXN3", "ATXN7", "C9ORF72", "CACNA1A", 
    "CBL", "CNBP", "CSTB", "DIP2B", "DMPK", "FMR1", "FXN", "HTT", 
    "JPH3", "NOP56", "PPP2R2B", "TBP"), class = "factor"), count = c(NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, 24L, 24L, 24L, 24L, 24L, 
    24L, 24L, 24L, 24L, 24L, 24L, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 
    21L, NA, NA, NA, NA, NA, NA, NA, NA, NA, 17L, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 15L, 15L, 
    16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 
    16L, NA, NA, NA, NA, 20L, 34L, 31L, 33L, 34L, 34L, 34L, 34L, 
    34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, NA, NA, NA, NA, NA, 
    22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 
    22L, 22L, 22L, NA, NA, NA, NA, NA, 22L, 24L, 24L, 24L, 24L, 
    24L, 24L, 24L, 24L, 24L, 24L, 24L, 24L, 24L, 24L, NA, NA, 
    NA, NA, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 11L, 11L, NA, NA, NA, NA, 6L, 8L, 8L, 
    8L, 8L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, NA, NA, 
    NA, NA, 11L, NA, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 11L, 11L, NA, NA, NA, 12L, 5L, NA, 12L, 
    12L, 5L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
    12L, NA, NA, NA, NA, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 
    15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, 20L, 20L, 18L, 20L, 20L, 20L, 20L, 20L, 
    20L, 20L, 20L, 20L, 20L, 20L, 20L, NA, NA, NA, NA, 27L, 24L, 
    21L, 14L, 27L, 14L, 21L, 27L, 27L, 14L, 27L, 27L, 27L, 27L, 
    27L, 27L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 68L, 73L, 
    78L, 54L, 79L, 76L, 87L, 72L, 62L, 63L, NA, NA, NA, NA, NA, 
    27L, 27L, 27L, 28L, 27L, 27L, 64L, 27L, 64L, 64L, 27L, 27L, 
    27L, 27L, 27L, NA, NA, NA, NA, NA, 18L, 20L, 18L, 20L, 20L, 
    18L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, NA, NA, 
    NA, NA, NA, 15L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, 9L, 7L, 9L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, NA, NA, NA, NA, NA, 14L, 
    14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
    14L, 14L, NA, NA, NA, NA, NA, 35L, 29L, 35L, 35L, 30L, 35L, 
    32L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 11L, 19L, 19L, 
    19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 
    19L, 19L, 19L, 19L, 19L, 20L, 11L, 20L, 20L, 20L, 20L, 20L, 
    20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 
    20L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, 16L, 16L, 16L, 16L, 16L, 16L, 
    16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 
    16L, 16L, 33L, 33L, 32L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 
    33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, NA, 21L, 
    22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 
    22L, 22L, 22L, 22L, 22L, 22L, 19L, 21L, 21L, 21L, 21L, 21L, 
    21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 
    21L, 19L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 8L, 8L, 
    7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
    8L, 8L, 8L, 11L, NA, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, 7L, 15L, 15L, 13L, 15L, 15L, 15L, 15L, 15L, 15L, 
    15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, 27L, 19L, 27L, 27L, 27L, 
    27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 
    27L, 27L, NA, 76L, 23L, 23L, 23L, 32L, 65L, 32L, 28L, 32L, 
    28L, 32L, 32L, 23L, 28L, 32L, 28L, 28L, 32L, 84L, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, 14L, 18L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 
    17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 15L, 
    NA, NA, 15L, NA, 15L, NA, NA, 15L, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, 9L, NA, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
    9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 14L, 14L, 14L, 14L, 
    14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
    14L, 14L, 14L, 14L, NA, 28L, 36L, 36L, NA, 36L, 36L, 36L, 
    36L, NA, 36L, NA, 36L, 36L, 36L, 36L, 36L, NA, 36L, 36L, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    1L, 8L, 18L, 16L, 15L, 14L, 15L, 16L, 15L, 16L, 14L, 15L, 
    14L, 14L, 14L, 14L, 16L, 16L, 16L, 16L, 31L, 28L, 31L, 31L, 
    32L, 32L, 32L, 33L, 31L, 33L, 32L, 31L, 32L, 32L, 32L, 32L, 
    32L, 32L, 32L, 32L, 7L, 18L, 22L, 22L, 22L, 22L, 22L, 22L, 
    22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 
    19L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 
    21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    11L, 11L, 11L, 11L, 11L, 5L, 6L, 6L, 8L, 8L, 8L, 8L, 8L, 
    8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 12L, 11L, 12L, 
    12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
    12L, 12L, 12L, 12L, 12L, 5L, 7L, 7L, 7L, 7L, 11L, 11L, 7L, 
    11L, 15L, 15L, 11L, 7L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
    1L, 2L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 4L, 20L, 17L, 7L, 7L, 7L, 7L, 7L, 7L, 
    7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 1L, 2L, 1L, 1L, 
    1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 1L, 15L, 6L, 22L, 13L, 14L, 13L, 14L, 13L, 14L, 14L, 
    27L, 27L, 14L, 14L, 27L, 14L, 27L, 14L, 27L, NA, 15L, 20L, 
    20L, 20L, 20L, 40L, 20L, 40L, 20L, 40L, 40L, 40L, 40L, 20L, 
    40L, 40L, 40L, 40L, 32L, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 15L, 14L, 
    17L, 17L, 17L, 19L, 17L, 13L, 17L, 17L, 17L, 17L, 17L, 17L, 
    17L, 17L, 17L, 17L, 17L, 17L, 5L, 3L, 1L, 8L, 8L, 8L, 8L, 
    8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 5L, 3L, 
    1L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
    8L, 8L, 8L, 12L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
    14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, NA, 
    2L, 3L, 2L, 29L, 33L, 33L, 35L, 33L, 35L, 35L, 33L, 35L, 
    35L, 33L, 35L, 35L, 35L, 35L, 35L)), class = "data.frame", row.names = c(NA, 
-1320L))
jay.sf
  • 60,139
  • 8
  • 53
  • 110
  • What's the question? I appreciate that you've shared data reproducibly and some code, but *"my goal is to estimate the downsampling effect on the count gave by the software and i want to do some regression to be able to compare themselves"* isn't super clear. What is unsatisfactory about your attempts? Are there errors or warning messages? If so, what are they? If not, are you getting results you don't like? What is wrong with them? Or is the problem in the plotting? – Gregor Thomas Dec 22 '21 at 15:26
  • Also, the sample data you've provided is so large that your question is butting up against the size limits for questions on Stack Overflow. Is **all** your data necessary to illustrate whatever your problem is, or could you find a smaller sample that will be easier to work with and easier to understand? – Gregor Thomas Dec 22 '21 at 15:30
  • 1
    yes of course you can use only some gene to reduce the amount of data, but there is a minimum of needed data. my question is : is it normal that is have the same value for every rate of downsampling? i want to do some regression but if i have the same results for every rate i'm afraid something wrong – Marine Bergot Dec 22 '21 at 16:37

1 Answers1

1

I believe the loess should be done on a split on the "software".

software <- unique(data2$software)

data_loess <- do.call(rbind, lapply(software, \(x) {
  X <- subset(data2, software == x)
  lo <- loess(count ~ samplexxx, X)
  count_pred <- predict(lo, newdata=X)
  return(cbind(X, count_pred))
}))

Note: R version 4.1.2 (2021-11-01)

Gives:

head(data_loess[data_loess$samplexxx > 80, ], 10)
#    samplexxx software gene count count_pred
# 17        85       EH AFF2    24   22.69004
# 18        90       EH AFF2    24   22.31879
# 19        95       EH AFF2    24   21.83428
# 20       100       EH AFF2    24   21.25618
# 37        85       EH   AR    21   22.69004
# 38        90       EH   AR    21   22.31879
# 39        95       EH   AR    21   21.83428
# 40       100       EH   AR    21   21.25618
# 57        85       EH ATN1    NA   22.69004
# 58        90       EH ATN1    NA   22.31879

And here a plot of "count" predictions on "samplexxx".

plot(count_pred ~ samplexxx, data_loess, col=as.numeric(software) + 1, 
     pch=20, xlab='Downsampling', ylab='Count (LOESS)')
legend('topleft', legend=software, pch=19, col=as.numeric(software) + 1,
       horiz=TRUE, cex=.7, title='Software')

enter image description here

Looks interesting, but I'm not sure if it's absolutely right.

In my answer you see something different from for loops, which is probably new to you, however it's the r-ish way and its much shorter to code. The looping job here does lapply().

Anyway, hope this helps.

jay.sf
  • 60,139
  • 8
  • 53
  • 110
  • Thanks a lot for your answer! seems it's exactly what i want :) i just have some issue with your code, i'm unfortunatly more python than R (as you noticed with my for loop) and i don't know the do.call() function then it's hard for me to fix the problem. with your code i have some error : Erreur : unexpected input in "data_loess <- do.call(rbind, lapply(software, \" ; after that i have of course error for every lines :/ – Marine Bergot Dec 23 '21 at 18:44
  • @MarineBergot It's rather the ``\`` than the `do.call`. The problem is highly probably that you have an old R version, update it! Alternatively replace ``\`` by `function`. Cheers – jay.sf Dec 23 '21 at 18:56
  • thanks for your answer :) i tried on my R version 4.1.0 and it worked ! just the "software" is not working, i had to replaced it by data2$software, but maybe you did something smarter? – Marine Bergot Dec 23 '21 at 19:39
  • @MarineBergot Almost, it's the `unique(data2$software)`, forgot to paste it, see update and thanks for noting! – jay.sf Dec 23 '21 at 19:48
  • @MarineBergot Short annex: [there](https://stackoverflow.com/a/10801883/6574038) are some nice answers concerning `do.call` and also `lapply`! – jay.sf Dec 23 '21 at 20:01