I have an ASP.NET Core app with a model. The aim is to allow the user to upload an Excel file and then save its contents to the model/table. I have the method below:

[HttpPost]
[ValidateAntiForgeryToken]
public async Task<IActionResult> Upload(IFormFile file)
{
    string webRootPath = _hostEnvironment.WebRootPath;
    var uploads = Path.Combine(webRootPath, "Upload");
    var files = HttpContext.Request.Form.Files;
    var extension = Path.GetExtension(files[0].FileName);

    using (var filesStream = new FileStream(Path.Combine(uploads, file.FileName), FileMode.Create))
    { 
        files[0].CopyTo(filesStream);
    }

    var list = new List<User>();

    using (var stream = new MemoryStream())
    {
        await file.CopyToAsync(stream);

        using (var package = new ExcelPackage(stream))
        {
            ExcelWorksheet worksheet = package.Workbook.Worksheets[0];  
            var rowcount = worksheet.Dimension.Rows;  

            for (int row = 2; row <= rowcount; row++)
            {
                list.Add(new User
                             {
                                 Name = worksheet.Cells[row, 1]?.Value?.ToString().Trim(),
                                 Address1 = worksheet.Cells[row, 2]?.Value?.ToString().Trim(),
                                 PostCode = worksheet.Cells[row, 3]?.Value?.ToString().Trim(),
                                 Mobile = worksheet.Cells[row, 4]?.Value?.ToString().Trim(),
                             });
            }
        }
    }

    foreach (var user in list)
    {
        await _db.User.AddAsync(user);
    }

    await _db.SaveChangesAsync();

    return View();
}

This code works fine for processing an Excel file uploaded by a user, but the problem I'm having is that when the file is large, say above 3 MB, it takes well over 8 minutes to upload.

Any idea how to speed this up please? Thanks.

  • One point to improve your performance is that you can make the method async and add `await` to the async methods when you call them. – Rena Jun 22 '22 at 07:33
  • Thanks for your comment, but the method is already async as you can see - it's just really slow for large files; the app even crashes handling just 50k rows of Excel records! – Jeff Jun 22 '22 at 13:58
  • Can't you just use _db.User.AddRangeAsync(list) with a cancellation token? Also make sure it is not your DB causing the issue. You might need to increase the amount of memory it can use, and add indexing to the table. EF performance is terrible when you get into the 10k's of records, so you may want to use ADO instead. However, take a look at this: https://stackoverflow.com/questions/59954097/ef-core-slow-bulk-insert-80k-rows. Also, you may need to adjust the timeout for the context in that method; I think the default is 30 seconds. – The OrangeGoblin Jul 13 '22 at 21:27
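For reference, a minimal sketch of what the last comment suggests: replace the per-row AddAsync loop with a single AddRangeAsync call and raise the context's command timeout. The _db context, the User DbSet, and the list variable are the ones from the question; the five-minute timeout is just an example value:

[HttpPost]
[ValidateAntiForgeryToken]
public async Task<IActionResult> Upload(IFormFile file, CancellationToken cancellationToken)
{
    // ... read the uploaded file into `list` exactly as before ...

    // The default command timeout is 30 seconds; raise it for large batches (example value).
    _db.Database.SetCommandTimeout(TimeSpan.FromMinutes(5));

    // Add all rows in one call instead of awaiting AddAsync once per row.
    await _db.User.AddRangeAsync(list, cancellationToken);
    await _db.SaveChangesAsync(cancellationToken);

    return View();
}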

1 Answer

There are two things you can do to increase speed.

1) Instead of reading the Excel file with the ExcelWorksheet class, go with a library called ExcelDataReader, which can read around 600k records in under a minute.

Sample code:

Model
class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// The Excel file has both columns from the model, so we can read it with the code below
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using ExcelDataReader;

System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);

var fileName = "./Person.xlsx";

var timer = new Stopwatch();
timer.Start();
int counter=0;

List<Person> persons = new List<Person>();

using (var stream = File.Open(fileName, FileMode.Open, FileAccess.Read))
{
    using (var reader = ExcelReaderFactory.CreateReader(stream))
    {
        reader.Read(); // skip the header row (assuming the first row holds column names)

        while (reader.Read()) // each remaining row of the file
        {
            var person = new Person
            {
                Id = Convert.ToInt32(reader.GetValue(0)),
                Name = reader.GetValue(1)?.ToString()
            };
            persons.Add(person);
            counter++;
        }

        timer.Stop();
        var duration = timer.ElapsedMilliseconds / 1000; // seconds taken to read the file
        // to check performance, print duration and the persons list
    }
}


https://github.com/ExcelDataReader/ExcelDataReader
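For the upload scenario in the question, the same reader can work directly off the posted IFormFile rather than a file on disk. A rough sketch, assuming the sheet has a header row and the same four columns as the question's User model:

// Sketch: reading the uploaded IFormFile with ExcelDataReader instead of EPPlus.
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);

var list = new List<User>();

using (var stream = new MemoryStream())
{
    await file.CopyToAsync(stream);
    stream.Position = 0; // rewind so the reader starts at the beginning

    using (var reader = ExcelReaderFactory.CreateReader(stream))
    {
        reader.Read(); // skip the header row

        while (reader.Read()) // each data row
        {
            list.Add(new User
            {
                Name = reader.GetValue(0)?.ToString()?.Trim(),
                Address1 = reader.GetValue(1)?.ToString()?.Trim(),
                PostCode = reader.GetValue(2)?.ToString()?.Trim(),
                Mobile = reader.GetValue(3)?.ToString()?.Trim()
            });
        }
    }
}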

2) Once you have read the data and stored it in a list, you can put it in a DataTable and insert it into the database using the Oracle.ManagedDataAccess.Client NuGet package instead of EF Core. This method is fast. Please go through the link below for doing this with an Oracle database.

https://www.c-sharpcorner.com/article/two-ways-to-insert-bulk-data-into-oracle-database-using-c-sharp/

using System;
using System.Data;
using System.Diagnostics;
using Oracle.ManagedDataAccess.Client;

var db_timer = new Stopwatch();
db_timer.Start();


DataTable dt = new DataTable();
dt.Columns.Add("id");
dt.Columns.Add("name");

for (int i = 0; i < counter; i++)
{
    DataRow dr = dt.NewRow();
    dr["id"] = persons[i].id;
    dr["name"] = persons[i].name;
    dt.Rows.Add(dr);
}

using (var connection = new OracleConnection(oracleConString))
{

    connection.Open();
    int[] ids = new int[dt.Rows.Count];
    string[] names = new string[dt.Rows.Count];
    
    for (int j = 0; j < dt.Rows.Count; j++)
    {
        ids[j] = Convert.ToInt32(dt.Rows[j]["id"]);
        names[j] = Convert.ToString(dt.Rows[j]["name"]);
    }

    OracleParameter id = new OracleParameter();
    id.OracleDbType = OracleDbType.Int32;
    id.Value = ids;

    OracleParameter name = new OracleParameter();
    name.OracleDbType = OracleDbType.Varchar2;
    name.Value = names;

    OracleCommand cmd = connection.CreateCommand();
    cmd.CommandText = "INSERT INTO TEST(id,name) VALUES (:1,:2)";
    cmd.ArrayBindCount = ids.Length;
    cmd.Parameters.Add(id);
    cmd.Parameters.Add(name);

    cmd.ExecuteNonQuery();

    db_timer.Stop();
    // check db_timer.ElapsedMilliseconds to see how long the bulk insert took
}

This is just sample code; you can use the timers to check how much time each step takes to execute.