
I'm trying to write an automation program that scrapes data from a site and updates it in a database every few seconds, and I ran into a problem.

This is my code:

    for (int i = 0; i < ips.Count; i++)
    {
        MySqlCommand command = con.CreateCommand();
        command.CommandText = "INSERT INTO proxiestb (ip,port) VALUES (@ip,@port)";
        command.Parameters.AddWithValue("@ip", ips[i].InnerText);
        command.Parameters.AddWithValue("@port", ports[i].InnerText);
        Console.WriteLine(ips[i].InnerText + ":" + ports[i].InnerText);
        command.ExecuteNonQuery();
    }

It works, but I want to check for duplicate rows and remove them, because I want this for loop to run again and again every 2-3 minutes. Please help me figure out what I need to add to my code to check for duplicates and remove them before the loop runs again. Thanks.

Dylan9
  • Does this answer your question? [Remove duplicate rows in MySQL](https://stackoverflow.com/questions/3311903/remove-duplicate-rows-in-mysql) – Luuk Nov 28 '21 at 15:45
  • There are actually two ways to control duplicates: one is to add a unique index to the table in the database, and the other is to check in code, before inserting, whether the row already exists in the table. – senthilkumar2185 Nov 29 '21 at 07:23

1 Answer


To prevent inserting duplicates you can blindly push every line to the DB and let it handle duplicates like this:

INSERT INTO proxiestb (ip, port)
SELECT @ip, @port FROM DUAL
WHERE NOT EXISTS (SELECT 1 FROM proxiestb WHERE ip = @ip AND port = @port)

You'll still need to clean up any duplicates that are already in your production tables, but the above should prevent new ones.
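One possible way to do that cleanup is a self-join delete. This is only a sketch: it assumes the table has an auto-increment primary key, called `id` here, which the question does not show, so adjust the column name to whatever your table actually uses.

```sql
-- Assumption: proxiestb has an AUTO_INCREMENT primary key named id
-- (hypothetical; the table definition is not shown in the question).
-- For each (ip, port) pair, keep the row with the lowest id and delete the rest.
DELETE newer
FROM proxiestb AS newer
JOIN proxiestb AS older
  ON  newer.ip   = older.ip
  AND newer.port = older.port
  AND newer.id   > older.id;
```

Running it on a copy of the table first is probably a good idea, since the delete cannot be undone outside a transaction.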

I don't know whether it's appropriate for your table outside of this bit of code, but a unique constraint may also be worth looking into.
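If a unique constraint does fit, it could look roughly like the following. The constraint name `uq_proxiestb_ip_port` is made up for illustration, and the sketch assumes `ip` and `port` are indexable column types such as `VARCHAR` and `INT`.

```sql
-- Assumption: ip and port use indexable types (e.g. VARCHAR and INT); the
-- constraint name uq_proxiestb_ip_port is made up for this example.
-- This statement fails if duplicate (ip, port) rows still exist in the table.
ALTER TABLE proxiestb
  ADD CONSTRAINT uq_proxiestb_ip_port UNIQUE (ip, port);

-- With the constraint in place, a row that would duplicate an existing
-- (ip, port) pair is skipped with a warning instead of raising an error.
INSERT IGNORE INTO proxiestb (ip, port)
VALUES (@ip, @port);
```

Keep in mind that `INSERT IGNORE` also turns some other errors into warnings, so a plain `INSERT` with the duplicate-key error caught in code may be preferable if you want to notice other failures.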

Peter Csala
DCAggie