I am working on a project built in Laravel 5.3. I have a table named `names` with two columns, `id` and `u_names`. The `names` table contains 4,550,000+ rows; I have never worked with data this large before. The problem is that the table contains duplicate records. I have been asked to copy this table, delete the original and create it again, and then import the copied table into the newly created original table using a `SELECT DISTINCT` query from PHP, so that every record in the original table ends up unique. I know how to do this with a `foreach` loop, but I don't know what will happen when there are more than 4.5 million records, or what configuration will be required, such as `memory_limit` and `max_execution_time`.
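For reference, this is roughly the `foreach`/chunked approach I have in mind (only a sketch, run inside the Laravel app, e.g. from an artisan command; `names_copy`, the chunk size, and the ini values are placeholders I am guessing at):

```php
<?php

use Illuminate\Support\Facades\DB;

// Guesses at the PHP limits such a copy might need
ini_set('memory_limit', '512M');
set_time_limit(0); // remove the execution time limit while the copy runs

// Read distinct names from the copied table in chunks and insert them
// back into the freshly recreated `names` table, one row at a time.
DB::table('names_copy')
    ->select('u_names')
    ->distinct()
    ->orderBy('u_names')
    ->chunk(1000, function ($rows) {
        foreach ($rows as $row) {
            DB::table('names')->insert(['u_names' => $row->u_names]);
        }
    });
```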

- Have you looked at `mysqldump`; do you have access to run queries in command line? – Manav Sep 28 '17 at 11:32
- I am not sure I understand your question, but why do you need to involve PHP and the part about dropping the table and create a new table? Would https://stackoverflow.com/questions/3311903/remove-duplicate-rows-in-mysql or http://www.mysqltutorial.org/mysql-delete-duplicate-rows/ work? – rypskar Sep 28 '17 at 11:38
- I would use `INSERT INTO ... SELECT` and let the database handle it completely ... read https://dev.mysql.com/doc/refman/5.7/en/insert.html – Raymond Nijland Sep 28 '17 at 11:54
- @Mjh you wrote quite a long answer, but your line `We'll use INSERT INTO ... SELECT` was enough to solve my problem. Thanks, your answer definitely solved it. – Muhammad Ibrar Oct 09 '19 at 13:46
1 Answer
The problem
Create a table that contains unique records, based on the contents of another table.
The solution
We're going to use a few queries to migrate the data.
We won't use PHP or Laravel to do anything with the data (because we don't have to). There's no reason for any data to even reach PHP.
Querying the database directly is enough. You can still use Laravel's migrations to define the queries if you want to.
The attack plan
- Create a new, empty table
- Add a `unique` index so we can prevent duplicate records
- Since names can be long, we'll slightly alter the table and add a `hash` of the `u_names` column; this makes our `unique` index 20 bytes long for every record
- To avoid manually inserting the unique key, we'll help ourselves with a trigger
- We'll use `INSERT INTO ... SELECT` syntax to move data from one table to another, with an `IGNORE` clause to ignore errors caused by duplicate records
The code
Create the table
CREATE TABLE new_names LIKE `names`;
Alter the table to add the unique key
Adding the unique key prevents duplicates from getting in. That way you never have to use a `DISTINCT` query.
ALTER TABLE new_names ADD unique_id BINARY(20) DEFAULT NULL AFTER id;
ALTER TABLE new_names ADD unique(unique_id);
Create the trigger which updates the unique_id for us
DELIMITER $$
CREATE
TRIGGER `new_names_before_insert` BEFORE INSERT
ON `new_names`
FOR EACH ROW BEGIN
SET NEW.unique_id = UNHEX(SHA1(NEW.u_names)); -- SHA-1 yields 20 bytes, matching BINARY(20)
END$$
DELIMITER ;
Migrate the data
INSERT IGNORE INTO new_names (id, u_names) SELECT id, u_names FROM names;
`IGNORE` means that when a duplicate is encountered, it is skipped and the insert moves on to the next record.
Conclusion
No data ever reaches PHP, so PHP won't be the bottleneck and you don't have to tamper with its memory settings.
Uniqueness is guaranteed because of the unique key. You will never have duplicate records in that table.
This migration will be as fast as possible because everything happens on the database server. The insert is a single atomic statement (either all records get in or none), so it is committed in one go instead of row by row.
Your task
- Change the table and column names accordingly
- Run this from the MySQL terminal, OR create migrations and execute the statements via `DB::statement()` (use `DB::unprepared()` for the trigger), as in the sketch below
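A minimal sketch of what such a migration could look like, assuming the table and column names above (the class name is illustrative; `DB::unprepared()` is used for the trigger because trigger definitions generally can't run as prepared statements, and the `DELIMITER` lines are only needed in the mysql client):

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Support\Facades\DB;

class DeduplicateNamesTable extends Migration
{
    /**
     * Build the deduplicated copy of the `names` table.
     */
    public function up()
    {
        // 1. Empty copy of the original table's structure
        DB::statement('CREATE TABLE new_names LIKE `names`');

        // 2. Add the 20-byte hash column and the unique index on it
        DB::statement('ALTER TABLE new_names ADD unique_id BINARY(20) DEFAULT NULL AFTER id');
        DB::statement('ALTER TABLE new_names ADD UNIQUE (unique_id)');

        // 3. Trigger that fills unique_id on every insert
        DB::unprepared('
            CREATE TRIGGER `new_names_before_insert` BEFORE INSERT
            ON `new_names`
            FOR EACH ROW BEGIN
                SET NEW.unique_id = UNHEX(SHA1(NEW.u_names));
            END
        ');

        // 4. Copy the data; duplicates are silently skipped thanks to IGNORE
        DB::statement('INSERT IGNORE INTO new_names (id, u_names) SELECT id, u_names FROM names');
    }

    /**
     * Roll back: drop the trigger and the copy.
     */
    public function down()
    {
        DB::unprepared('DROP TRIGGER IF EXISTS `new_names_before_insert`');
        DB::statement('DROP TABLE IF EXISTS new_names');
    }
}
```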
Good luck!
