I'm currently trying to delete duplicate rows in MySQL 5.7 (InnoDB) and am checking to see how many duplicates I have of a mediumtext
column by running SELECT COLUMN, COUNT(*) FROM TABLE GROUP BY COLUMN HAVING COUNT(*) > 1
. The most recent query returned:
[results]
31620 rows in set (17.98 sec)
If I run the exact same query a moment later, I get:
[results]
31594 rows in set (17.35 sec)
And so on. I get a different result almost every single time. Nothing is writing to the database during the query. It is doing this only with this query; SELECT COUNT(*) FROM TABLE
, SELECT COUNT(*) FROM TABLE WHERE COLUMN LIKE <VALUE>
, and so on, all produce consistent results. This error also does not occur when executing SELECT COLUMN, COUNT(*) FROM TABLE GROUP BY COLUMN HAVING COUNT(*) > 0
.
I'm not sure what other code to offer to help answer this, as this is the only query I'm running and I'm doing it right in the console. I'm trying to think of what could possibly account for this. Given the other problems I've had with this same database, I'm wondering if it's possible that something is corrupted.
Edit: I've run 1000 queries to sample the results and they come out like this:
The upper limit of 33991 is the most common result.
The charset of the table is utf8mb4
, and the collation of the column being aggregated is utf8mb4_general_ci
.
The output of EXPLAIN SELECT COLUMN, COUNT(*) FROM COLUMN GROUP BY COLUMN HAVING COUNT(*) > 1;
while using MyISAM:
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+---------------------------------+
| 1 | SIMPLE | TABLE | NULL | ALL | NULL | NULL | NULL | NULL | 788685 | 100.00 | Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+---------------------------------+
Results for InnoDB:
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+---------------------------------+
| 1 | SIMPLE | TABLE | NULL | ALL | NULL | NULL | NULL | NULL | 769501 | 100.00 | Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+---------------------------------+
Things I have tried so far as suggested in comments:
- Memtest, using the
memtest
Linux package and runningmemtest 15G 2
(system has 16G of memory with 15.4 available and about .4 in use. This is a cloud machine and I can't boot with Memtest, although I've put in a request with the provider to see if they can. - Enabling general log, which showed no other activity being run between queries.
- Using
OPTIMIZE TABLE
. - Dropping and re-adding the index.
- Changing the table engine to MyISAM from InnoDB, which did seem to help a little as the query now reaches a max limit after a few queries, but it still bounces around for the initial few queries.