
I have only 1 table with 1 million rows that I need to update. The data I need to update is in a file.txt, and it is only 1 value to update in each of the 1 million rows.

I use PHP to read the file and run the UPDATE.

The problem is the time: it is very slow. Any ideas to improve this and make it faster?

What I have is:

UPDATE productos SET urlRedirect = getChangeProducto WHERE url LIKE 'getUrlProducto'

TXT file, 2 values separated by a semicolon (;); the first is getChangeProducto and the second is getUrlProducto:

https://www.example.com/guess/8595651?o=34&t=direct&f=102.301&ord=111;https://example.com/p/guess-jeans-116-175-cm-327554

PHP FILE:

$mysqli = get_dbc();
if ($mysqli->connect_errno) {
    echo "Failed to connect to MySQL: (".$mysqli->connect_errno.") ".$mysqli->connect_error;
    exit();
}
$mysqli->set_charset("utf8");

if (!$mysqli) exit(mysqli_connect_error());

$handle = fopen("fileToUpdate.txt", "r");
$total_lines = 0;

if ($handle) {
    $line = fgets($handle);
    while ($line !== false) {
        $url_to_delete = trim($line);
        $trozos = explode(";", $url_to_delete);
        $total_lines++;
        if ($stmt = $mysqli->prepare("UPDATE productos SET urlRedirect = ? WHERE url = ?")) {
            $stmt->bind_param('ss', $trozos[1], $trozos[0]);
            $stmt->execute();   // Execute the prepared query.
            $stmt->close();     // Close inside the if, so a failed prepare() is not closed.
        }
        $line = fgets($handle);
    }
    fclose($handle);
}
// Output — Total Lines: 6305.
echo "Total Lines: $total_lines.\n";
Dragonasu
  • Try dropping potential indexes on the table and recreating them after the update (see the sketch after this comment list). This typically speeds up such operations dramatically, since the index does not have to be updated after each atomic change. – arkascha May 12 '21 at 09:00
  • Apart from that: using the `LIKE` operator is always extremely slow, and there is no reason in the SQL query you posted why you should have to use it. – arkascha May 12 '21 at 09:01
  • And last: try not to update each row one by one. Check whether you can create a temporary table from the imported values and then execute an update over the joined tables. That moves the computation from PHP to MySQL, which typically makes it much faster. – arkascha May 12 '21 at 09:02
  • if it's something common (in your DB) to have the same `urlRedirect` for many different products, you may consider normalizing this data and using another table to store the URLs – Cid May 12 '21 at 09:04
  • Thank you @arkascha, but I'm new and I don't know how to do that. Can you help me or tell me how I can do it? – Dragonasu May 12 '21 at 09:04
  • And yeah, don't use `LIKE` operator if you don't plan to use wildcards, as far as I know, it's slower than `=` – Cid May 12 '21 at 09:05
  • It's unlikely that you're using a case-sensitive collation, but I guess we have to ask. – Strawberry May 12 '21 at 09:11
  • @Strawberry I'm not using a case-sensitive collation – Dragonasu May 12 '21 at 09:14
  • How about showing us the schema, an example of the input CSV file, and the PHP code you are using to do the update? Then we can try to help rather than play 20 questions – RiggsFolly May 12 '21 at 09:15
  • In which case `url = 'geturlproducto'` will be more efficient – Strawberry May 12 '21 at 09:17
  • @arkascha - Recreating the indexes would help _only if_ the `UPDATE` modified the indexed column(s). Otherwise, there is zero impact (at least for InnoDB). – Rick James May 12 '21 at 18:14
  • `LIKE 'getUrlProducto'` -- That does not look like a URL. Perhaps you are substituting a url-like string? Meanwhile, it will probably not update any rows. – Rick James May 12 '21 at 18:16
  • @Cid - MySQL Optimizes `x LIKE '...'` to `x = '...'` when there are no wildcards. – Rick James May 12 '21 at 18:18
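
To make arkascha's index suggestion concrete, here is a hypothetical sketch. Per Rick James's caveat, dropping and recreating only pays off for a secondary index on a column the UPDATE actually modifies (here urlRedirect); any index on url must stay, since the WHERE clause needs it to find rows. The index name idx_redirect is an assumption, not taken from any posted schema:

// Hypothetical: only useful if a secondary index exists on the modified column.
// The name idx_redirect is an assumption; check SHOW INDEX FROM productos first.
$mysqli->query("ALTER TABLE productos DROP INDEX idx_redirect");

// ... run all the UPDATEs here ...

// Recreate the index once, after the bulk update has finished.
$mysqli->query("ALTER TABLE productos ADD INDEX idx_redirect (urlRedirect)");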

1 Answer


I also encountered this problem before, around 2009 or 2010, when I needed to update multiple entries fast because of a request timeout. One solution I found was to extend the server's request-timeout configuration, but another was to create a temporary table on the database server and relate the two tables. Unfortunately, I cannot share the link because that was a long time ago. So here is what I did, as far as I can recall:

  1. Create a temporary table with a structure similar to your CSV, bearing the primary key that relates it to the table you want to update. I know MySQL has temporary-table capability.
  2. From your CSV, insert the records in bulk (in batches, with multiple value tuples per INSERT statement). This will be fast.
  3. Once you have inserted all the records, relate the temporary table to your table in an UPDATE statement.
  4. Example: UPDATE orig_table ot, temp_table tt SET ot.URLRedirect=tt.URLRedirect WHERE ot.ID=tt.ID
  5. Drop the temporary table.

The UPDATE statement will set the value in orig_table to the new value from temp_table, provided the rows relate through their ID.
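
A sketch of the whole flow in PHP, using the productos, url, and urlRedirect names from the question; the temp_urls table name, the VARCHAR sizes, and the batch size of 1000 are assumptions:

// Step 1: temporary table keyed on the column used to relate the two tables.
$mysqli->query("CREATE TEMPORARY TABLE temp_urls (
    url VARCHAR(255) NOT NULL,
    urlRedirect VARCHAR(255) NOT NULL,
    PRIMARY KEY (url)
)");

// Step 2: bulk-insert the file, many value tuples per INSERT statement.
$values = [];
foreach (file("fileToUpdate.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
    list($redirect, $url) = explode(";", $line, 2); // first field is the new redirect
    $values[] = "('" . $mysqli->real_escape_string($url) . "','"
              . $mysqli->real_escape_string($redirect) . "')";
    if (count($values) === 1000) {
        $mysqli->query("INSERT INTO temp_urls (url, urlRedirect) VALUES " . implode(",", $values));
        $values = [];
    }
}
if ($values) {
    $mysqli->query("INSERT INTO temp_urls (url, urlRedirect) VALUES " . implode(",", $values));
}

// Steps 3-4: one set-based UPDATE over the join replaces thousands of single-row UPDATEs.
$mysqli->query("UPDATE productos p JOIN temp_urls t ON p.url = t.url
                SET p.urlRedirect = t.urlRedirect");

// Step 5: temporary tables disappear with the connection, but drop explicitly anyway.
$mysqli->query("DROP TEMPORARY TABLE temp_urls");

The single joined UPDATE lets MySQL do all the matching internally, which is the main reason this approach is so much faster than one client round trip per row.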

paul