
I've just got a routine working that uploads a CSV file from a client's web browser to a server directory. The file is then read into a string array with:

public HttpResponseMessage Upload()
{
    HttpPostedFile file = HttpContext.Current.Request.Files[0];
    var path = Path.Combine(
        HttpContext.Current.Server.MapPath("~/App_Data/uploads"),
        file.FileName);
    file.SaveAs(path);   // write the upload to disk before reading it back

    string[] Lines = System.IO.File.ReadAllLines(path);
    ReadCsv(Lines, file.FileName); //......ETCETC

This all works great; my issue is the speed. I know the problem is that I loop through each record in the array and insert it into my model one row at a time. My question is: how can I make this faster? With about 19,000 rows it is a 90-minute process, while the file upload itself takes only 10 seconds for 20 MB. I need to alter the procedure called UploadCSV() so it copies the records in one step, perhaps with a single LINQ-to-SQL statement, which would mean no looping. Does anyone have ideas how to do this? Here is my slow procedure that loops over each record; I need to get from the array to the database faster, ideally in one step:

public void UploadCSV(string[] Lines, string filename)
{
    // The first line is the header row; it fixes the column count.
    string[] Fields = Lines[0].Split('|');
    int Cols = Fields.Length;

    for (int i = 1; i < Lines.Length; i++)
    {
        Fields = Lines[i].Split('|');

        Water water = new Water()
        {
            SupplyPointId = decimal.Parse(Fields[0]),
            D2002_ServiceCategory = int.Parse(Fields[1]),
            D2003_Schedule3 = decimal.Parse(Fields[2]),
            D2004_ExemptCustomerFlag = Fields[3] == "1",
            D2005_CustomerClassification = Fields[4],
            D2006_29e = decimal.Parse(Fields[5]),
            D2007_LargeVolAgreement = Fields[6] == "1",
            D2008_SICCode = Fields[7] == "" ? -1 : int.Parse(Fields[7]),
            D2011_RateableValue = decimal.Parse(Fields[8]),
            D2014_FarmCroft = Fields[9],
            D2015_SPIDVacant = Fields[10] == "1",
            D2018_TroughsDrinkingBowls = Fields[11] == "1",
            D2019_WaterServicesToCaravans = Fields[12] == "1",
            D2020_OutsideTaps = Fields[13] == "1",
            D2022_TransitionalArrangements = Fields[14] == "1",
            D2024_Unmeasurable = Fields[15] == "1",
            D2025_NotifyDisconnection_Reconnection = Fields[16],
        };
        System.Diagnostics.Debug.WriteLine(i);

        Add(water);
        Save();   // one database round trip per row
    }
}
Noctis
John
  • PS: I never mentioned it because I thought it was irrelevant to the main question. But this is running asynchronously using Web API. My plan is to have it churn away in a window and let users get on with other tasks, like using the rest of the system, while the server deals with the database batch update – John Nov 17 '13 at 05:37
  • `LINQ` does not mean `no looping`. – OneFineDay Nov 17 '13 at 05:39
  • What database are you using? – aquinas Nov 17 '13 at 05:47
  • Don, I am aware of that, thank you. aquinas, my database is a SQL Server instance running in Azure. I'm using Database First to update my model. The procedure above with the slow loop is from my repository class. Many thanks for reading – John Nov 17 '13 at 05:49

1 Answer


A couple of different ideas. If you REALLY want to use EF then look at this thread:

Improving bulk insert performance in Entity framework

The big thing is to not call Save() after every insert.
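
For example, a minimal sketch of that change, assuming an EF6 Database First context (`WaterContext` and the `Waters` set are placeholder names here; match them to your own repository). The per-field mapping stays exactly as in the question's loop, but `SaveChanges` runs once at the end:

```csharp
// Sketch only: WaterContext / Waters are assumed names for your
// DbContext subclass and DbSet<Water>.
public void UploadCsvBatched(string[] lines)
{
    using (var db = new WaterContext())
    {
        // EF6: skip per-Add change detection and validation for bulk work
        db.Configuration.AutoDetectChangesEnabled = false;
        db.Configuration.ValidateOnSaveEnabled = false;

        for (int i = 1; i < lines.Length; i++)   // skip the header row
        {
            string[] fields = lines[i].Split('|');
            db.Waters.Add(new Water
            {
                SupplyPointId = decimal.Parse(fields[0]),
                D2002_ServiceCategory = int.Parse(fields[1]),
                // ... remaining fields mapped as in the original loop ...
            });
        }

        db.SaveChanges();   // one save instead of one per row
    }
}
```

Even just this change usually turns minutes of per-row round trips into seconds, because the context batches the tracked inserts into far fewer database calls.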

But for the best possible performance, don't use Entity Framework for this task. Just put everything into a DataTable, and then use the SqlBulkCopy class. This is the kind of job it's built for. It should take a second or two to load all of your data.
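
A sketch of that approach. The destination table name `dbo.Water` and the column set are assumptions; they must match your actual table (or be wired up via `ColumnMappings`):

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch only: builds a DataTable from the already-split lines and
// streams it to SQL Server in one bulk operation.
public void BulkUploadCsv(string[] lines, string connectionString)
{
    var table = new DataTable();
    table.Columns.Add("SupplyPointId", typeof(decimal));
    table.Columns.Add("D2002_ServiceCategory", typeof(int));
    table.Columns.Add("D2003_Schedule3", typeof(decimal));
    // ... add the remaining columns the same way ...

    for (int i = 1; i < lines.Length; i++)   // skip the header row
    {
        string[] f = lines[i].Split('|');
        table.Rows.Add(decimal.Parse(f[0]), int.Parse(f[1]), decimal.Parse(f[2]));
    }

    using (var bulk = new SqlBulkCopy(connectionString))
    {
        bulk.DestinationTableName = "dbo.Water";   // assumed table name
        bulk.BatchSize = 5000;                     // optional: send in chunks
        bulk.WriteToServer(table);                 // one bulk load, not 19,000 INSERTs
    }
}
```

This bypasses change tracking entirely and uses the same bulk-load path as `bcp`, which is why 19,000 rows should land in seconds rather than minutes.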

aquinas