0

I am writing a program to iterate through a database table (1.4 million rows), selecting information, making a web request to geocode from Google then I want to store the longitude and latitude in the same row I got the address from.

I think there might be a better way than I have come up with as I have two connections opened to the database and the opening and closing of the connection each time is most probably going to give me some overhead.

I am aware that I could use a JSON return instead ox XML and I might change to that but I would like to figure out the database connection issue first and then a way to make this program wait for the enforced 24 hour wait period to elapse before re-using the API keys?

So here is what I have so far, I would be very grateful for any insights any of you have.

string ApiKey = apikey[a];  //I have a couple of API keys in an array
while (a < 7)
{
    using (SqlConnection conn = new SqlConnection(connection))
    {
        for (int j = 1; j <= 1463758; j++)
        {
            if (count % 2498 == 0) //2500 requests per API per day
            {
                a = a + 1;
            }
            SqlCommand cmd = new SqlCommand("SELECT AddressToGeoCode FROM testDB WHERE id = @id AND Status = 'Not Done'", conn);
            try
            {
                conn.Open();
                cmd.Parameters.AddWithValue("@id", j);
                SqlDataReader reader = cmd.ExecuteReader();
                while (reader.Read())
                {
                    string address = reader[0].ToString();

                    var requestUri = string.Format("https://maps.googleapis.com/maps/api/geocode/xml?address={0}&key={1}", Uri.EscapeDataString(address), ApiKey);

                    var request = WebRequest.Create(requestUri);
                    var response = request.GetResponse();
                    var xdoc = XDocument.Load(response.GetResponseStream());

                    string status = xdoc.Element("GeocodeResponse").Element("status").Value;
                    var result = xdoc.Element("GeocodeResponse").Element("result");
                    var locationType = result.Element("geometry").Element("location_type").Value;
                    var locationElement = result.Element("geometry").Element("location");
                    float lat = float.Parse(locationElement.Element("lat").Value.ToString());
                    float lng = float.Parse(locationElement.Element("lng").Value.ToString());
                    Console.WriteLine("Api key number: " + a + ", " + apikey[a]);
                    Console.WriteLine("status: " + status);
                    Console.WriteLine("Coordinates: " + lat + ", " + lng);

                    using (SqlConnection conn2 = new SqlConnection(connection))
                    {
                        SqlCommand cmd2 = new SqlCommand("UPDATE testDB SET Lat = @lat, Long = @lng, LocationType = @lType WHERE id = @id", conn2);
                        try
                        {
                            conn2.Open();
                            cmd2.Parameters.AddWithValue("@id", j);
                            cmd2.Parameters.AddWithValue("@lat", lat);
                            cmd2.Parameters.AddWithValue("@lng", lng);
                            cmd2.Parameters.AddWithValue("@lType", locationType);
                            cmd2.ExecuteNonQuery();
                            conn2.Close();
                        }
                        catch (Exception ex)
                        {
                            Library.WriteErrorLog("Update exception " + ex.ToString());
                        }
                    }
                    count = count + 1;
                }
                conn.Close();
            }
            catch (Exception e)
            {
                Library.WriteErrorLog("Update exception " + .ToString());
            };
        }

        Console.Read();
    }
}
Eugene Podskal
  • 10,270
  • 5
  • 31
  • 53
Dren79
  • 89
  • 6
  • Instead of all these nested try catch blocks you should of made use of using. – JonH Mar 11 '16 at 19:30
  • Not an answer to your question but since you are using pass through sql and AddWithValue you need to understand that it will sometimes interpret the datatype incorrectly. It is better to specify the datatype explicitly when using pass through queries like this. http://blogs.msmvps.com/jcoehoorn/blog/2014/05/12/can-we-stop-using-addwithvalue-already/ – Sean Lange Mar 11 '16 at 19:32
  • Hi JonH, thats is a good point, would I still not have to have two separate usings? – Dren79 Mar 11 '16 at 19:41
  • Hey Sean, I didn't know that, I will modify the code to include specifying data types. – Dren79 Mar 11 '16 at 19:43
  • The first connection open and close can be put outside the for loop.. Regarding 24 hours, you can schedule this in Task Scheduler as a task that runs every day once at a particular time.By the way looks like you only want to process 2500 records every 24 hours which is not a lot. – nobody Mar 11 '16 at 19:46
  • Hi inquisitive_mind If I move the first open close outside the for loop the connection will timeout? The variable a that is incremented hrere`if (count % 2498 == 0) //2500 requests per API per day { a = a + 1; } ` moves to the next API key which adds another 2498 for each key. If I use the scheduler I will need to find the first record that has not been done and use that as a seed point to start. Now if I get enough Keys that it ran for longer than 24 hours or if we had network issues, what would happen then? – Dren79 Mar 11 '16 at 19:56
  • 1
    Better on Code Review? – Andy Wiesendanger Mar 11 '16 at 19:56
  • I'm voting to close this question as "Too broad" as there is no specific problem to address. If you are looking for improvements to working code, then [Code Review](http://codereview.stackexchange.com/help) would be a good fit for this question, just make sure to look over the help center first. – Phrancis Mar 11 '16 at 20:04

1 Answers1

0

it's much more efficient to update all rows with one SQL Query rather than update each row in the loop.

you can insert all items that you want to update in a tempTable and than make an update join statement and execute it outside of the loop.

bulk-record-update-with-sql is an example

Community
  • 1
  • 1
esiprogrammer
  • 1,438
  • 1
  • 17
  • 22