3

Okay so, I'm currently running this code to move a bunch of data from a pervasive database into an access database

public List<HBPData> LoadData()
    {

        loadConnect(); //<-- get's the Pervasive/Access string from a text file
        List<HBPData> listofhbpdata1 = new List<HBPData>();
        List<HBPData> listofhbpdata2 = new List<HBPData>();

            PsqlConnection myConnection = new PsqlConnection();
            myConnection.ConnectionString = PervasiveString;
            myConnection.Open();
            PsqlCommand myCommand = new PsqlCommand("Select NUMBER, CUST_NAME, PO_NO, REQD_DATE, PO_NO, CUST_PO_NO, ORD_DATE, STATUS FROM SALES_ORDER_HEADER WHERE ORD_DATE > 20120220 Order By ORD_DATE desc", myConnection);
            PsqlDataReader myreader = null;
            myreader = myCommand.ExecuteReader();

            while (myreader.Read())
            {
                HBPData DataEntity = new HBPData();
                DataEntity.NUMBER = (myreader["NUMBER"].ToString());
                DataEntity.CUST_NO = (myreader["CUST_NAME"].ToString()).Replace("'","");
                DataEntity.PO_NO = (myreader["PO_NO"].ToString());
                DataEntity.RequiredDateTime = (myreader["REQD_DATE"].ToString());
                DataEntity.Tag = (myreader["PO_NO"].ToString());
                DataEntity.Shape = (myreader["CUST_PO_NO"].ToString());
                DataEntity.ExpectedCompletion = myreader["ORD_DATE"].ToString().Substring(0, 4) + "/" + myreader["ORD_DATE"].ToString().Substring(4, 2) + "/" + myreader["ORD_DATE"].ToString().Substring(6, 2);
                DataEntity.MostRecentStatus = (myreader["STATUS"].ToString());
                listofhbpdata1.Add(DataEntity);
            }

            PsqlCommand myCommand1 = new PsqlCommand("Select NUMBER, RECNO, CODE, ORDD_DESCRIPTION, BVORDQTY FROM SALES_ORDER_DETAIL WHERE BVRVADDDATE > 20120220 AND (PROD_CODE = \'MET\' OR PROD_CODE = \'MDT\') Order By NUMBER desc", myConnection);
            PsqlDataReader myreader1 = null;
            myreader1 = myCommand1.ExecuteReader();

            while (myreader.Read()) 
            {
                HBPData DataEntity = new HBPData();
                DataEntity.NUMBER = (myreader1["NUMBER"].ToString());
                DataEntity.RECNO = (myreader1["RECNO"].ToString());
                DataEntity.CODE = (myreader1["CODE"].ToString());
                DataEntity.DESCRIPTION = (myreader1["ORDD_DESCRIPTION"].ToString());
                DataEntity.Quantity = (myreader1["BVORDQTY"].ToString());
                listofhbpdata2.Add(DataEntity);
            }

            myConnection.Close();
            myreader1.Close();
            myreader.Close();






            System.Data.OleDb.OleDbConnection myAccessConnection = new System.Data.OleDb.OleDbConnection();

            myAccessConnection.ConnectionString = AccessString;
            myAccessConnection.Open();
            System.Data.OleDb.OleDbCommand myAccessCommand3 = new System.Data.OleDb.OleDbCommand("delete from AllOrders", myAccessConnection);
            myAccessCommand3.ExecuteNonQuery();

            for (int i = 0; i < listofhbpdata2.Count(); ++i)
            {
                System.Data.OleDb.OleDbCommand myAccessCommand2 = new System.Data.OleDb.OleDbCommand("" +
                    "Insert into AllOrders VALUES('" +
                      listofhbpdata2[i].NUMBER + "'" + ",'" + listofhbpdata2[i].RECNO.ToString() + "'" +
                    ",'" + listofhbpdata2[i].CODE + "','" + listofhbpdata2[i].DESCRIPTION.Replace("\'", "F") + "'" +
                    ",'" + listofhbpdata2[i].Quantity + "')", myAccessConnection);
                myAccessCommand2.ExecuteNonQuery();
            }

            myAccessConnection.Close();

        return listofhbpdata1;
    }

Now,. If you look closely, I typo'd the 2nd reader, it should read while(myreader1.read()) ... i accidentally put myreader.read()

putting myreader.read() much to my surprise actually ran successfully... this is what kind of blew my mind,... i changed it to "myreader1.read()" and the run time of the code was almost double ...... , anyway, checking the database, all of the data was there.....

so, using common sense, i kind of figured well, it probably just does both sets of code every time it runs the first reader,

but then how come all of the data is there?

There's significantly less fields in the Sales_Order_Header than Sales_Order_Detail, if it's doing the reader for the first one, shouldn't it finish at the end of the header table and then stop? so why is all the data there?

Either way, the run time of this code is relatively slow, does anyone have any suggestions for improving my code?

Edit: Just to show that the 2nd reader is not infact returning false: Debugger

as you can see, the debugger has stepped inside of the reader

3 Answers3

4

Are you sure that you are getting the right data in the second call to myreader? Something doesn't look right: your looping through myreader which should be getting the data from your first SELECT statement, but your inner code references myreader1.

So the strange thing here is not that the second iteration should be faster than the first one: it is that the second iteration is returning you the data you expect.
So the question is: are you sure that in that second loop:

  • you are getting the expected number of iterations for all the records you expect from that second SELECT statement, say 5000 (as opposed to the number of records from the first statement, say 1000).

  • you are actually getting data for every records in the second SELECT statement, not just the same top record each time.

Regarding the second part of your question, how to improve the speed of your data transfer, I would recommend the following:

  • Adding data through the execution of individual INSERT statements is going to be slow.
    Have a look at this question for some very fast alternatives:
    Writing large number of records (bulk insert) to Access in .NET/C#

  • If you're doing a lot of work on your Access database, keep a connection permanently open to it instead of opening/closing it. For the reason why this can have a big impact on performance, see Where is the OLE DB connection pooling?.
    I usually create a table I call Dummy, with a single record in it (doesn't matter what it is), and then open a data reader on that table that I keep open until I shutdown the application. This ensures that the database lock file is kept in place and not created/deleted every time I do some operation on the database. You would be surprised at the effect this has on performance if you're doing a lot of operations on the database.

Community
  • 1
  • 1
Renaud Bompuis
  • 16,596
  • 4
  • 56
  • 86
  • All of the data is being transferred i have check thoroughly, i cannot leave a connection open to the database because A) It's on the fileserver at work (that will slow everyone else down) B) Other programs / applications use the same database and if i leave the connection open they will obtain speed issues of their own, and bulk insert you say ? maybe i'll give that a go :P –  Mar 30 '12 at 19:54
  • Keeping a link open to the Access database will not affect other users connected to it. Access is a file database. If no other connection is open, every time you open/close one, the ACE engine has to check the lock file and create/remove it, and that is __very__ time consuming. If the lock file is already created because you have kept a link open, the subsequent open/close will be very fast. No resources on the server are used, you could have 100 people connected to the dummy table. It won't affect performance in any way. The last person to close the dummy table will also remove the lock file. – Renaud Bompuis Mar 31 '12 at 02:25
  • Bulk insert / dummy table made my program work much much much faster, Thank you good sir. –  Mar 31 '12 at 19:25
1

I don't know why, But i guess i'll write out an answer to my own question.

Although i don't have a good answer for why the 2nd reader runs successfully (without losing data), i have a few ways of making this code run faster that weren't suggested

First Off ~

 while (myreader.Read()) 
        {
            HBPData DataEntity = new HBPData();
            DataEntity.NUMBER = (myreader1["NUMBER"].ToString());
            DataEntity.RECNO = (myreader1["RECNO"].ToString());
            DataEntity.CODE = (myreader1["CODE"].ToString());
            DataEntity.DESCRIPTION = (myreader1["ORDD_DESCRIPTION"].ToString());
            DataEntity.Quantity = (myreader1["BVORDQTY"].ToString());
            listofhbpdata2.Add(DataEntity);
        }

        myConnection.Close();
        myreader1.Close();
        myreader.Close();






        System.Data.OleDb.OleDbConnection myAccessConnection = new System.Data.OleDb.OleDbConnection();

        myAccessConnection.ConnectionString = AccessString;
        myAccessConnection.Open();
        System.Data.OleDb.OleDbCommand myAccessCommand3 = new System.Data.OleDb.OleDbCommand("delete from AllOrders", myAccessConnection);
        myAccessCommand3.ExecuteNonQuery();

        for (int i = 0; i < listofhbpdata2.Count(); ++i)
        {
            System.Data.OleDb.OleDbCommand myAccessCommand2 = new System.Data.OleDb.OleDbCommand("" +
                "Insert into AllOrders VALUES('" +
                  listofhbpdata2[i].NUMBER + "'" + ",'" + listofhbpdata2[i].RECNO.ToString() + "'" +
                ",'" + listofhbpdata2[i].CODE + "','" + listofhbpdata2[i].DESCRIPTION.Replace("\'", "F") + "'" +
                ",'" + listofhbpdata2[i].Quantity + "')", myAccessConnection);
            myAccessCommand2.ExecuteNonQuery();
        }

This code is Redundant for two reasons :

  • I should be adding to the database within the reader as oppose to creating a for loop that goes through a list that i created which isn't being used for anything else

  • I'm emptying the table and refilling it with the same data + the extra data, when i should be checking if what i'm inserting already exists and then inserting only rows that don't currently exist

I have replaced that code, with this:

while (myreader1.Read())
                    {
   System.Data.OleDb.OleDbCommand myAccessCommand2 = new System.Data.OleDb.OleDbCommand(
   "INSERT INTO AllOrders(OrderNumber,RecordNumber,Code, Description, Quantity) " +
   "SELECT TOP 1 '" + (myreader1["NUMBER"].ToString()) + "'" + ",'" + myreader1["RECNO"].ToString() + "'" +
   ",'" + (myreader1["CODE"].ToString()) + "','" + (myreader1["ORDD_DESCRIPTION"].ToString()).Replace("\'", "F") + "'" +
   ",'" + (myreader1["BVORDQTY"].ToString()) + "'" +
   " from AllOrders " +
   "WHERE NOT EXISTS(SELECT TOP 1 OrderNumber FROM AllOrders Where OrderNumber = '" + myreader1["NUMBER"].ToString() +"')", myAccessConnection);
                        myAccessCommand2.ExecuteNonQuery();

                    }

Now,

Even though running myreader.read seemed to be going faster, i replaced it with myreader1 just in case it's doing something bad that i couldn't find

Now it runs much much faster. I didn't bother using DAO like suggested in Writing large number of records (bulk insert) to Access in .NET/C#

Because i'm already using system.data.OleDb

Community
  • 1
  • 1
1

Can you give us the lines that run slowest?

Try using blocks to work with IDisposable objects, such as DB Connections. This way, you'll be safe in presence of exceptions. And you won't need to be explicit about calling Close().

For loop has a lot of string addition. Try a StringBuilder instead.

GregC
  • 7,737
  • 2
  • 53
  • 67
  • If you're referring to try catch blocks, i previously had a try catch block on this, i removed it because i assumed it would consume resources, and it doesn't throw any exceptions –  Mar 26 '12 at 15:34
  • Database access can and will throw. Since you're not in control of SQL engine code (I assume). If it throws, you'll leak major resources, such as DB connections. – GregC Mar 26 '12 at 15:37
  • Oh.. The line that runs the slowest is .Open on the pervasive database Connection, i doubt there's anything i can do about that <. –  Mar 26 '12 at 15:39
  • If it throws while reading from the pervasive database nothing bad will happen , if it throws while writing to the access database, when it gets run again it calls delete anyway ? i'm kinda new to this stuff, maybe i'll re-add it –  Mar 26 '12 at 15:43
  • @JustinKirk try splitting your code into more methods. Maybe scope rules will make it more apparent where you have a typo. "Single method, single responsibility." – GregC Mar 26 '12 at 17:08
  • The only typo is noted in my post, this code compiles successfully, but according to logic it shouldn't! because 'myreader' is called on the Second reader and already finished reading after the first reader ends, If i split this into more methods it will make it impossible to read (on this forum atleast), but the real question is, how come the 2nd reader doesn't throw some sort of error?, and it works twice as fast as using myreader1 (which seems like what it is suppose to be) –  Mar 26 '12 at 18:03
  • @JustinKirk I am not interested in why it doesn't work correctly so much as I am interested in what I can do to make sure the question about typos enters my head. As in, how can I make the compiler see that I am doing something silly. I am sure there's some lazily-evaluated stuff under the hood that does not need to initialize again. Magic, I say, pure magic. – GregC Mar 26 '12 at 23:08
  • @GreC the compiler could see that the you're doing something wrong if you put the readers in a using block. But simply copy pasting some block of code and not adjusting all references is imo the root cause. It has nothing to do with compiler checking nor logic. – Martin Mar 31 '12 at 14:22