38 Million records - Deleting duplicate rows basis column name and keeping just one

Question

Disclaimer - I've read many fantastic questions & their answers already and have given it a try too. The only issue is that considering the database size, the system gets stuck at "loading" and it just sits there. By looking at the total number of rows, I've seen changes happening but its just not significant when it doesn't give a warning or do it in pieces. I do have a fair idea of tweaking available code and make it work but I'm not a full time/advance developer (yet!)

Question - I've been working on a database that comprises of product information but has duplicate values (was foolish to not make "Product code" column as unique while importing multiple CVS into the database). I need help with removing the duplicate basis the "product code" but want to "keep one" which has maximum information under "specification" column.

Database - MySQL Total records - 36 Million + Total columns - Not more than 15 (but are of less relevance) Problem - Multiple duplicate values basis "product code" but keep one that has maximum number of characters in the "specifications column"

Database details; Table name - pro

Columns names are as; productid - VARCHAR, manPartId - VARCHAR, specification - TEXT

So far, I've picked the following code from and have given it a try, but the system gets stuck at "loading" and nothing happens. I assume this is because of the huge number of records it has.

Code that I had tried running in the phpMyAdmin "SQL" section is;

--------------------------------------------
      delete pro
      from pro
      inner join (
         select max(productid) as lastId, manPartId
           from pro
          group by manPartId
         having count(*) > 1) duplic on duplic.manPartId = pro.manPartId
         where pro.productid < duplic.lastId;
--------------------------------------------

The above code has been tweaked from the original available at MySQL delete duplicate records but keep latest

Please do help and understand on where am I going wrong. Please also note that I do understand that above code only works around "deleting all but keeping one" and is not "keeping one basis total text available in Specification column".

Many thanks in advance!

EDIT - As per recommendations from aendeerei, I've made some edits to the details.

-------------------------------------------------------
productid  | manPartId    |  specification 
-------------------------------------------------------
1            ABC1           5MP camera, 2500 MaH, Steel body
2            ABC2           2MP camera, Steel body
3            ABC3           5MP, 6500 MaH, Red
4            ABC1           2500 MaH, Steel body
5            ABC2           5MP camera, plastic body
6            ABC4           5MP camera, 2500 MaH, Steel body
7            ABC5           15MP camera, 4500 MaH 
8            ABC2           5MP camera
9            ABC3           15MP, 6500 MaH, Blue body
10           ABC5           2500 MaH, Steel body
-------------------------------------------

In the above case, I'm looking at removing the duplicates basis manPartId but want to keep one record that has the maximum (characters) in the specification field.

After running the query, I'd like to see the following updated data having unique manPartId with max text under specification column;

-------------------------------------------------------
productid  | manPartId    |  specification 
---------------------------------------------------------------
1            ABC1           5MP camera, 2500 MaH, Steel body
5            ABC2           5MP camera, plastic body
6            ABC4           5MP camera, 2500 MaH, Steel body
7            ABC5           15MP camera, 4500 MaH, Long life
9            ABC3           15MP, 6500 MaH, Blue body
---------------------------------------------------------------

Please accept my apologies if its still unclear!

i suggest not using phpmyadmin run the code in a script or the mysql console. — , Jul 05 '17 at 04:36
better way is to export the table into a csv / text file and use shell script to modify the csv. URL: https://www.cyberciti.biz/faq/unix-linux-shell-removing-duplicate-lines/ — Viswanath Polaki, Jul 05 '17 at 05:07
What you exactly want is not understandable for me. Can you help me understand, please? You have duplicates and you speak about "product code". Is "product code" = "productid"? And: why do you want to use "specification" exactly, and how? Please make your question slim: what you want (with exact column names), then present exact column names, then the data types and then the values of one or two rows. The rest is really not needed (except maybe the sql statement you presented). I have code for you, but I don't know how to "map" it to your situation. So, help! :-) — , Jul 05 '17 at 05:08
@rtfm is right: phpmyadmin is a bad option for running your resource eating delete query. — , Jul 05 '17 at 05:17
@aendeerei - Thank you for the suggestions. As per the questions asked, "specification" column has data about products and there are columns where there is lesser data (total info about product) hence it plays an important part when we decide what Product code to delete and what to keep. I have made amendments in the question above and hopefully it makes more sense now. Also what do you recommend on using a better way to execute the query than directly doing in the phpmyadmin. Thanks! — Amit Agnihotri, Jul 05 '17 at 06:50
(note: the table name "pro" isn't very meaningful) Perhaps create a NEW table based on the definition of the existing one. Then insert to that new table using a select query that removes the unwanted rows. Check the new table is what you want to retain. Check it again. Drop your existing table (after you checked). Rename the new table. Add any missing indexes/constraints to avoid this nightmare in future. This may be easier and more reliable than a mass delete. https://dev.mysql.com/doc/refman/5.7/en/ansi-diff-select-into-table.html — Paul Maxwell, Jul 05 '17 at 07:20
Create a new table, with the missing PRIMARY KEY. Then use INSERT IGNORE...SELECT to get unique values only. — Strawberry, Jul 05 '17 at 07:30
+1 @Strawberry. And when SELECTing from the old table, order the records by LENGTH(description) DESC, so the longest descriptions will get inserted first, hence the smaller descriptions will be the ones ignored — Ivo P, Jul 05 '17 at 07:36
Thank you soo much for the comments and sharing more insights. Given my limited knowledge, it is coming out to be a little too technical for me. Appreciate your kind gesture and apologies for sounding so naive. I've added more details to the question above with example on what do i have and what kind of output i'm looking for. Many thanks in advance! @Strawberry - Do you suggest that i create a new table with the details as shared by you and import the CSV files once again? — Amit Agnihotri, Jul 05 '17 at 11:21
@AmitAgnihotri Yesterday you said "_Thank you for the suggestions. As per the questions asked_", then nothing more. That, only, was what I saw in your comment. So I interpreted like "Ok, this user don't want to change anything, even if I asked him to change". So I gone away. But much later I saw that you actually made a lot of changes, in order to help us understand better. That's why I say now: You have my full appreciation for that! — , Jul 06 '17 at 05:13
@AmitAgnihotri Hi. Have you resolved your task? Do you have questions? — , Jul 06 '17 at 20:25
@aendeerei Hello! I've had used the procedure (late last night) as shared by you through MySQL Workbench(which works much faster than Php MyAdmin). Although i've had multiple issues including "Error Code: 2013. Lost connection to MySQL server during query" which I resolved by increasing the time to the lines of 84000. I do have some questions for you too; — Amit Agnihotri, Jul 07 '17 at 03:56
@AmitAgnihotri In MySQL Workbench >> Preferences there are three options with the labels beginning with "_DBMS connection..._". It's about periods in seconds. Change them correspondingly in order to avoid such errors as yours (lost connection, etc...). Please feel free to ask me anything. — , Jul 07 '17 at 04:07
@AmitAgnihotri Well, I hope you'll ask me now, because I have to go soon. — , Jul 07 '17 at 04:19
@aendeerei When I ran step 2, records under "Protemp" copied to 28M only. I guess its to do with the script getting running out of time (is it?). Secondly when i ran the STEP 3, the rows in table "Pro" first went down from 36M to 31M and then back to 35.8M. Please note that the script stopped multiple times with "Error Code: 1206. The total number of locks exceeds the lock table size" and i ran the script again by hitting the button again. Was that the right choice? Many thanks! — Amit Agnihotri, Jul 07 '17 at 05:10
@AmitAgnihotri 1) No, is ok, it should be 28M <= 38M. Because you have `GROUP BY pro.manPartid` in step 2 statement. That means: get each `manPartid` only once and copy it in `proTmp`. — , Jul 07 '17 at 06:34
@AmitAgnihotri 2a) I think you see more records because of that `FROM (SELECT * FROM pro AS tmp) AS a`. That should be `FROM pro AS a`, but MySQL engine don't allow it. So, that `select` makes internally a temporary representation of data of `pro`. But remember: MySQL Workspace can also show you things, that don't exist anymore. After a very big operation like yours, better close and reopen connection again, so that all is refreshed correctly. — , Jul 07 '17 at 06:52
An influence on step 3 would also have the fact that table `pro` has relation(s) with other tables defined. 2b) Solution for the error: [MySQL: The total number of locks exceeds the lock table size](https://major.io/2010/02/16/mysql-the-total-number-of-locks-exceeds-the-lock-table-size-2/). — , Jul 07 '17 at 06:53
A very quick question, on how do i ensure that the STEP 3 is executed fully as it is giving me the error "Error Code: 1206. The total number of locks exceeds the lock table size". Basis your recommendation, I have also updated the "innodb_buffer_pool_size" to 1G in the Xampp/MySQL/Bin/My.cnf (configuration file). Thanks! — Amit Agnihotri, Jul 07 '17 at 07:09
3) "_Was that the right choice?_". Yes it was ok to click the button again. Technically, the step-queries achieve this in the end: that the duplicates with length of `specification` smaller than the maximum one are deleted. Even if records appear strangely as in (2a) above, they can only be duplicates of the existing records, and even with a maximum length of the `specification` field. — , Jul 07 '17 at 07:09
My advice: you run all steps without errors (!). Then, later today, we write a query to test for duplicates again: but this time for duplicates like (pseudocode): `... WHERE manPartid1 = manPartid2 AND specification1 = specification2` (1 and 2 meaning record one and record 2). — , Jul 07 '17 at 07:09
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/148577/discussion-between-aendeerei-and-amit-agnihotri). — , Jul 07 '17 at 07:11
@AmitAgnihotri Have you updated it to 1G and still comes the same error (1206)? Then make it 2G, then 3G, etc... until you don't get that error anymore. But restart the "MySQL service" each time after a change in your MySQL config file. And be sure, that you change the property `innodb_buffer_pool_size` in the right config file! — , Jul 07 '17 at 07:21
@AmitAgnihotri Read the accepted answer from here: [What is the location of mysql client “.my.cnf” in XAMPP for Windows?](https://stackoverflow.com/questions/4292769/what-is-the-location-of-mysql-client-my-cnf-in-xampp-for-windows) — , Jul 07 '17 at 07:24
@AmitAgnihotri I must go now, so come here in exactly 8 hours from now. We'll speak in chat all the problems that you might still have and all solutions until the end. Ok? Bye. — , Jul 07 '17 at 07:26
@aendeerei Made all the change suggested by you. Increasing innodb_buffer_pool_size to 1G, did its magic and i didn't encounter the same issue anymore. Finally got the message that a total "375335 row(s) affected". That was the good news & the bad is that when I'm trying to assign Unique key to manPartId, it still says that it has a row that has duplicate value. Do i restart MySQL to see if the problem persist or is there any other better way around this. What i understand from the same is that my database column manPartId for table "Pro" still has some duplicate values in it. :( — Amit Agnihotri, Jul 07 '17 at 10:58
@AmitAgnihotri It is normal to still have duplicates, because there may have been records with the same `manPartid` and max-length of `specification`, BUT with different `specification` text! Read my `NOTA BENE` in my answer codes from yesterday. — , Jul 07 '17 at 13:54
@AmitAgnihotri Hi. Have you resolved the problem? Need any help? — , Jul 09 '17 at 21:53
@aendeerei I can't thank you enough for helping me with this project. People like yourselves, makes me feel motivated to learn (and make mistakes and learn from it). Thank you everyone in the community for helping me in resolving this. You guys are all awesome! — Amit Agnihotri, Jul 10 '17 at 11:55
@AmitAgnihotri You are welcome. But you still didn't tell if you need more help or if your tables are ok now... or completely empty :-)) — , Jul 10 '17 at 23:40
@aendeerei hahaha i assume that the steps that you helped me with got this sorted out. The tables still has duplicate data that you mentioned is because of manPartid having more than one record of same max-length of specifications. That i assume is causing this issue. I think if we create another query that now removes the duplicates but keeping the latest entry, will eventually help us sort this out completely. — Amit Agnihotri, Jul 14 '17 at 07:45
Well, I'm glad it worked until this part. So, what do you need exactly? — , Jul 14 '17 at 08:00
Can you please help me with a simple script that now does the following; Find & delete out how duplicate records basis manPartid but keep one record which has large productid (meaning it was added recently). Many thanks! — Amit Agnihotri, Jul 14 '17 at 09:01
@AmitAgnihotri I'm done. I reedited my answer. So, please read "**EDIT 1**" carefully and let it run. Let me know. Ciao — , Jul 14 '17 at 10:02
Thanks @aendeerei Let me give it a try and confirm. In the meanwhile, can you please look at https://stackoverflow.com/questions/45109313/pagination-taking-up-too-much-display-space-for-big-database-how-to-limit-disp and help me with it? — Amit Agnihotri, Jul 14 '17 at 18:12

score 1 · Answer 1 · answered Jul 05 '17 at 11:56

First, basis, find all parts longest length (Query #1)

SELECT 
      manPartID,
      MAX( CHAR_LENGTH( specification )) longestLength
   from
      pro
   group by
      manPartID

WITH that as the baseline, now look for all parts that have that same longest length. But in the case where there are more than one with the exact same length, you need to pick one, such as the first ProductID or the most recent ProductID to be kept... (Query #2)

SELECT
      p.manPartID,
      MAX( p.productid ) as ProductID
   from
      pro p
         JOIN
            ( Entire Query #1 above ) byLen
          ON p.manPartID = byLen.manPartID
          AND char_length( p.specification ) = byLen.LongestLength
   group by
      p.manPartID

So at this point, you have only one "ProductID" for a single "manPartID" based on the longest specification... Now, you can delete from the main table where it is NOT one of the above such as below. I am doing a LEFT JOIN to the #2 query because I want all records compared and delete only those that are NOT FOUND in the keep result set.

DELETE FROM Pro
   LEFT JOIN (entire query #2 above) Keep
      ON Pro.ProductID = Keep.ProductID
   where Keep.ProductID IS NULL

Now, on a table of 36 million records, you PROBABLY want to ensure the above works before blowing away your data. So instead of deleting, I would create a new secondary table of products and insert into that just to confirm you are getting what you are hoping for...

INSERT INTO SomeTempTable
SELECT p1.*
   from Pro p1
      JOIN ( query #2 above ) Keep
         ON p1.ProductID = Keep.ProductID

Notice this one is a JOIN (not left-join as used in the delete) as I WANT only those products I expect to keep

I am sure there are other elements on the table as described, so to help the query performance, I would have the following index on your "Pro"duct table.

(manPartID, specification, productID)

This way the work can be done off the indexes and not have to go through all the data pages for every record.

This is going to be a lot slower than just creating a new table. I Imagine it will take FOREVER — Strawberry, Jul 05 '17 at 12:33
@Strawberry, lover your optimism... Build something by not destroying though. What about historical data associated with such "productid" values. Not enough info of historical impact to tell the person ... just blow away your table and create a new one. Offering solutions to the question on info provided. And no, I doubt it would "take forever" as you claim... Even if the sub queries are done into temporary tables to save overhead. — DRapp, Jul 05 '17 at 12:50
Well, that's a fair point about preserving integrity - but I'm still not convinced performancewise. — Strawberry, Jul 05 '17 at 14:03

score 1 · Answer 2 · 2017-07-14T17:18:49.263

Well, I can't say much here. Just follow the steps (three + an intermediary one) and read my comments carefully. I have chosen a convenient way for you: to run a simple query per step. It can be done also in other way, e.g. using a stored procedure, or many. But it wouldn't be better for you, since your task is a one-occasion process and a very sensible one. It is better to have the control over all operation results.

You asked me in the comments, what should you use as interface for your task. Well, MySQL Workbench is a good one for such operations, but it breaks/freezes a lot. phpmyadmin? Hm... I use for now SequelPRO and I must say, that I really like it. Can it manage your task? I don't know. But for sure I know one which can: the best MySQL software I ever used - and I'll certainly buy it for personal use too - was SQLyog. An extremely powerful, stable and robust application. Especially when you're dealing with duplicates/exports of databases: it never disappoints.

I saw that you have VARCHAR as data type for the column productid. Make it like this:

`productid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)

And, if you want to never have duplicates in manPartid column, then create a UNIQUE index on it.

I also recommend you to keep a uniform naming convention. Like:

productId, or product_id, instead of productid
manPartId, or man_part_id, instead of manPartid

And give the name products to the products table.

Now, I structured my answer into two parts: "steps to follow" and "results". For each step I posted the corresponding step results.

Before you begin doing something:

MAKE A BACK-UP OF YOUR DATA!

I wish you good luck!

STEPS TO FOLLOW:

=================================================================
STEP 1:
=================================================================
Create a new table proTmp with the following columns:
- manPartid: definition identical with pro.manPartid
- maxLenSpec: maximum specification length of each pro.manPartid.
=================================================================

    CREATE TABLE `proTmp` (
      `manPartId` varchar(255) DEFAULT NULL,
      `maxLenSpec` bigint(20) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;



===============================================================
STEP 2:
===============================================================
- Truncate table proTmp;
- Get a dataset with all [pro.manPartid, pro.maxLenSpec] pairs;
- Store the dataset into table proTmp.
===============================================================

    TRUNCATE proTmp;
    INSERT INTO proTmp (
        SELECT 
            pro.manPartid
            , MAX(LENGTH(pro.specification)) AS maxLenSpec
        FROM pro
        GROUP BY pro.manPartid
    );



=============================================================
INTERMEDIARY STEP - JUST FOR TEST.
IT ONLY DISPLAYS THE RECORDS WHICH WILL BE DELETED IN STEP 3:
=============================================================
Left join tables pro and proTmp and display only the 
records with pro.lenSpec = proTmp.maxLenSpec.
- lenSpec: length of pro.specification
=============================================================


a) Get pro.*, pro.lenSpec and proTmp.* columns, ordered by pro.manPartid.
_________________________________________________________________________

    SELECT 
        a.*
        , LENGTH(a.specification) as lenSpec
        , b.*
    FROM pro AS a
    LEFT JOIN proTmp AS b ON b.manPartid = a.manPartid
    WHERE LENGTH(a.specification) = b.maxLenSpec
    ORDER BY a.manPartid;


b) Get only pro.productid column, ordered by pro.productid.
___________________________________________________________

    SELECT a.productid
    FROM pro AS a
    LEFT JOIN proTmp AS b ON b.manPartid = a.manPartid
    WHERE LENGTH(a.specification) = b.maxLenSpec
    ORDER BY a.productid;



====================================================================
STEP 3:
====================================================================
Delete all records from pro having pro.lenSpec != proTmp.maxLenSpec.
IMPORTANT: ordered by pro.productid !!!
====================================================================

    DELETE FROM pro
    WHERE 
        pro.productid NOT IN (
            SELECT a.productid
            FROM (SELECT * FROM pro AS tmp) AS a
            LEFT JOIN proTmp AS b ON b.manPartid = a.manPartid
            WHERE LENGTH(a.specification) = b.maxLenSpec
            ORDER BY a.productid
        );

THE RESULTS OF THE EXECUTED OPERATIONS:

------------------------------------------------------------------------------------------------------------------
NOTA BENE:
------------------------------------------------------------------------------------------------------------------
NOTICE THAT I ADDED A NEW RECORD INTO TABLE pro, WITH THE productid = 11 & manPartid = "ABC1". ITS specification 
COLUMN HAS THE SAME MAXIMUM LENGTH AS THE RECORD WITH THE productid = 1 & manPartid = "ABC1" !!! IN THE END, 
AFTER STEP 3, E:G: AFTER DELETION OF DUPLICATES, BOTH RECORDS SHOULD STILL EXIST IN TABLE pro, BECAUSE THEY BOTH
HAVE THE MAXIMUM LENGTH of specification COLUMN. THEREFORE, THERE WILL STILL EXIST SUCH DUPLICATES IN THE TABLE 
pro AFTER DELETION. IN ORDER TO DECIDE WHICH ONLY ONE OF THESE DUPLICATES SHOULD REMAIN IN THE TABLE, YOU MUST
THINK ABOUT SOME OTHER CONDITIONS AS THE ONES WE KNOW FROM YOU IN THIS MOMENT. BUT, FIRST THINGS FIRST...
SEE ALSO THE RESULTS AFTER RUNNING STEP 3.
------------------------------------------------------------------------------------------------------------------


=================================================================
CREATION SYNTAX AND CONTENT OF TABLE pro, USED BY ME:
=================================================================

CREATE TABLE `pro` (
  `productid` varchar(255) DEFAULT NULL,
  `manPartId` varchar(255) DEFAULT NULL,
  `specification` text
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


    --------------------------------------------------------
    productid   manPartId   specification
    --------------------------------------------------------
    1           ABC1        5MP camera, 2500 MaH, Steel body
    10          ABC5        2500 MaH, Steel body
    2           ABC2        2MP camera, Steel body
    3           ABC3        5MP, 6500 MaH, Red
    4           ABC1        2500 MaH, Steel body
    5           ABC2        5MP camera, plastic body
    6           ABC4        5MP camera, 2500 MaH, Steel body
    7           ABC5        15MP camera, 4500 MaH
    8           ABC2        5MP camera
    9           ABC3        15MP, 6500 MaH, Blue body
    11          ABC1        12345678901234567890123456789012


===============================================================
STEP 1 - RESULTS: Creation of table proTmp
===============================================================

Just the table proTmp was created, without any content.


===============================================================
STEP 2 - RESULTS: Table proTmp content
===============================================================

    ----------------------
    manPartId   maxLenSpec
    ----------------------
    ABC1        32
    ABC2        24
    ABC3        25
    ABC4        32
    ABC5        21


============================================================
INTERMEDIARY STEP RESULTS - JUST FOR TEST.
IT ONLY DISPLAYS THE RECORDS WHICH WILL BE DELETED IN STEP 3
============================================================


a) Get pro.*, pro.lenSpec and proTmp.* columns, ordered by pro.manPartid.
_________________________________________________________________________

    ----------------------------------------------------------------------------------------------
    productid   manPartId   specification                       lenSpec     manPartId   maxLenSpec
    ----------------------------------------------------------------------------------------------
    1           ABC1        5MP camera, 2500 MaH, Steel body    32          ABC1        32
    11          ABC1        12345678901234567890123456789012    32          ABC1        32
    5           ABC2        5MP camera, plastic body            24          ABC2        24
    9           ABC3        15MP, 6500 MaH, Blue body           25          ABC3        25
    6           ABC4        5MP camera, 2500 MaH, Steel body    32          ABC4        32
    7           ABC5        15MP camera, 4500 MaH               21          ABC5        21


b) Get only pro.productid column, ordered by pro.productid.
___________________________________________________________

    ---------
    productid
    ---------
    1
    11
    5
    6
    7
    9


===========================================================================================
STEP 3 - RESULTS: Table pro after deletion of all duplicates by the two conditions
===========================================================================================

From the log after running the DELETE query:
"No errors, 5 rows affected, taking 6.5 ms"

NOTA BENE: NOTICE THAT THERE ARE STILL TWO RECORDS WITH THE manPartid = "ABC1",
BECAUSE THEY BOTH HAD THE SAME MAXIMUM LENGTH OF THE specification COLUMN !!!

    --------------------------------------------------------
    productid   manPartId   specification
    --------------------------------------------------------
    1           ABC1        5MP camera, 2500 MaH, Steel body
    11          ABC1        12345678901234567890123456789012
    5           ABC2        5MP camera, plastic body
    9           ABC3        15MP, 6500 MaH, Blue body
    6           ABC4        5MP camera, 2500 MaH, Steel body
    7           ABC5        15MP camera, 4500 MaH

I hope it all works.

EDIT 1:

DELETE ALL RECORDS EXCEPT THE ONES WITH MAXIMAL `productid`:

STEP 1: THIS STEP MUST BE APPLIED. OTHERWISE YOU ARE RECEIVING A FALSE LIST OF THE RECORDS TO DELETE!:

Convert column productid from VARCHAR to:

`productid` bigint(20) unsigned NOT NULL

STEP 2: Run the following DELETE query.

DELETE FROM pro
WHERE pro.productid NOT IN (
    SELECT max(b.productid) AS maxPartid
    FROM (SELECT * FROM pro AS a) AS b
    GROUP BY b.manPartid
);

Done. Please read "**EDIT 1**" carefully and let it run. Good luck ;-) — , Jul 14 '17 at 09:56
For the users who downvote(d) my question: Please let me know the motive(s) of your downvotes, so that I can change my answer correspondingly. This way we can contribute together to the continuous improvement of this website. — , Jul 14 '17 at 17:14
I've used the above code but the issue is that it is taking too much time. Infact I ran this query last night and this has just removed 69 records and the query is still running. :( — Amit Agnihotri, Jul 16 '17 at 05:30
@AmitAgnihotri That's bad. Have you converted column `productid` to a `BIGINT` data type first?! — , Jul 16 '17 at 05:34
I realised that that productid is INT already and i had mentioned this as VARCHAR earlier, by mistake. Do you still want me to change this to BIGINT ? — Amit Agnihotri, Jul 16 '17 at 07:55
Make a backup of you db data first! Yes, make it BIGINT(20). After that make it PRIMARY KEY (with autoincrement if possible), or define a UNIQUE INDEX for it. Condition: it should not contain any duplicates. Then try again. — , Jul 16 '17 at 14:19
I had given it a go but the problem is that whole system gets stuck when i try and edit the productid to BIGINT. I tried both through phpmyadmin & though MySQL workbench. Seems like its rather better to start this from scratch. :( — Amit Agnihotri, Jul 18 '17 at 04:02
@AmitAgnihotri No, is not a better idea to begin from scratch! Forget BIGINT. You have INT for now. Just make `productid` a PRIMARY KEY. It takes a very long time, is normal. So let your system work on doing the PK index. — , Jul 18 '17 at 04:10
seeking your help with another on here at https://stackoverflow.com/q/45978176/8256826 — Amit Agnihotri, Aug 31 '17 at 09:40
Thank you for taking time to comment. Can I please request you to share the comments on the other page. Also, the tracking_id value is Auto increment that gets populated every time new data in inserted in the database. Do you have an email id, that i can send you more details on? Thanks! — Amit Agnihotri, Sep 01 '17 at 11:05