1

In my previous post, I was able to upload multiple files in Shiny, process the files, rbind the results and return a csv file download, using a for loop. Thanks for the contribution of @SBista. However, because I have to upload a lot of files at a time (total size of about 50 - 100mb), I found running the shiny app to be very slow perhaps due to the for loop. I know that lapply() is faster in reading multiple csv files than for loop, but applying lapply() in my code gives an error (ERROR: Invalid 'description' argument ) after running the app. Any help will be appreciated. This is my dummy file, and this is my code:

 library(shiny)

 ui <- fluidPage(
   fluidPage(
     titlePanel("MY CSV FILES MERGER"),
     sidebarLayout(
       sidebarPanel(
         fileInput("file1",
              "Choose CSV files from a directory",
              multiple = TRUE,
              accept=c('text/csv', 
                       'text/comma-separated-values,text/plain', 
                       '.csv')),
         downloadButton('downloadData', 'Download')
       ),
       mainPanel(
         tableOutput('contents')
       )
     )
   )
 )

 library(shiny)
 library(dplyr)
 options(shiny.maxRequestSize = 100*1024^2)
 server <-  function(input, output) {
   getData <- reactive({
     inFile <- input$file1
     if (is.null(inFile)){
       return(NULL)
     }else {   
      files3 = lapply(inFile, function(y){
        JSON_csv = read.csv(y, header = TRUE)
        lastrow = nrow(JSON_csv)
        shift = function(x, n){
          c(x[-(seq(n))], rep(NA, n))
        }
        JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
        JSON_csv = JSON_csv[-lastrow, ]
        JSON_csv 
      }

                 )
       do.call(rbind, files3)
     }
   })
   output$contents <- renderTable( 
     getData() 
   )
   output$downloadData <- downloadHandler(
     filename = function() { 
       paste("data-", Sys.time(), ".csv", sep="")
     },
     content = function(file) { 
       write.csv(getData(), file, row.names=FALSE)   
     })
 }

 shinyApp(ui = ui, server = server)

With for loop, this code works but it is very very slow while working with multiple csv files of 50-100mb:

 library(shiny)
 library(dplyr)
 server <-  function(input, output) {
 getData <- reactive({
  inFile <- input$file1
  if (is.null(inFile)){
    return(NULL)
  }else {
    # browser()
    numfiles = nrow(inFile) 
    kata_csv1 = list()


    for (i in 1:numfiles)
    {

      JSON_csv = read.csv(input$file1[[i, 'datapath']], header = TRUE)
      lastrow = nrow(JSON_csv)
      shift = function(x, n){
        c(x[-(seq(n))], rep(NA, n))
      }
      JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
      kata_csv1[[i]] = JSON_csv[-lastrow, ]

    }
    # browser()
    do.call(rbind, kata_csv1)
     }
   })
  output$contents <- renderTable( 
  getData() 
  )
  output$downloadData <- downloadHandler(
  filename = function() { 
    paste("data-", Sys.Date(), ".csv", sep="")
  },
  content = function(file) { 
    write.csv(getData(), file, row.names=FALSE)   
  })
  }

 shinyApp(ui = ui, server = server)
William
  • 340
  • 7
  • 17

2 Answers2

1

Here's a possible solution for files without using for loop:

 library(readxl)
    file.list <- list.files(pattern='*.xlsx')
    df.list <- lapply(file.list, read_excel)

Shiny. I run all my files in a separate script and reference the script using 'source' in the shiny app

source("SCRIPTGEO.R", local = TRUE) 

Here's a link on pulling multiple files. Reading multiple files into R - Best practice

Here's what I did on my app in the Script. I'm no expert so there may be other ways...

    fils1 <- list.files(pattern = ".csv")


allquotes <- function(fils1){
  dfs <- lapply(fils1, function(x){
  df <- read.csv(x, header = T, stringsAsFactors = FALSE)
  df <- df[-c(1,nrow(df)),]
  df <- df[,c(1,2,3,5,6,7,8)]
  colnames(df) <-  c("ID", "ENTRY_DATE", "QUOTEORORDER","BILL.TO", "NAME", "BRANCH", "CONVERTED")
  return(df)
  })
  testbind <- do.call("rbind", dfs)
  return(testbind)
}
astronomerforfun
  • 349
  • 3
  • 14
  • Hi @astronomerforfun. Thanks. I want to use shiny to upload the files, and perform some operation on each file using this portion of the code ` lastrow = nrow(csvfile) shift = function(x, n){ c(x[-(seq(n))], rep(NA, n)) } csvfile$companyID1 = shift(csvfilev$companyID1, 1) kata_csv1[[i]] = csvfile[-lastrow, ]`. Thereafter, I will now rebind the file. Any help? I was able to that using `for` loop – William Oct 22 '17 at 11:26
  • I'm not exactly sure what you have going on. But I put a copy of my code in original post. Basically use lapply and then write a function to do the necessary adjustments. I did have problems with this for a while myself. – astronomerforfun Oct 22 '17 at 17:13
  • Hi @astronomerforfun. Thanks for your contribution. I fixed the problem. It is much faster to read the files in a separate R script and reference the script using 'source' in the shiny app. I tested this two options and found that reading and performing some processes on 10 files of about 90mb in a separate R script and referencing the script using 'source' in the shiny app takes about 3 minutes on my machine, but running the whole code within server.R of my app took hours. – William Oct 23 '17 at 21:10
  • Nice! Glad I could help. Exactly what I do on mine. I've created a few Shiny Apps. – astronomerforfun Oct 24 '17 at 22:34
1

the problem is when you are passing inFile to lapply you are actually passing only the first column containing the filename. Instead you'll need to passinFile$datapath. The lapply should be like this:

   files3 = lapply(inFile$datapath, function(y){

     JSON_csv = read.csv(y, header = TRUE)
     lastrow = nrow(JSON_csv)
     shift = function(x, n){
       c(x[-(seq(n))], rep(NA, n))
     }
     JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
     JSON_csv = JSON_csv[-lastrow, ]
     JSON_csv 
   }

Hope it helps!

SBista
  • 7,479
  • 1
  • 27
  • 58
  • Hi @Bista, many thanks for your contribution. However, when I uploaded about 2 files, the app ran well, but when I uploaded about 10 files (about 50mb), then I got this error: `Warning in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : EOF within quoted string`. – William Oct 22 '17 at 13:33
  • Maybe your issue is related to [this](https://stackoverflow.com/questions/17414776/read-csv-warning-eof-within-quoted-string-prevents-complete-reading-of-file)? – SBista Oct 22 '17 at 13:36
  • Hi @Bista, I found solution to the `error: Warning in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : EOF within quoted string.`. These arguments: `quote = "", fill = FALSE` has to be included in the `csv()`. – William Oct 23 '17 at 08:27