1

I wonder if anyone could assist? Table_A below shows an example product in my MySQL product database. I'm having problem working out the query logic to achieve Table_B below. (It's not the INSERT command I'm having a problem with.)

Table_A

 entry_id | SKU  | Product          | Parent_SKU  | IsVariant
----------------------------------------------------------------
     1    | ABC  | Green T-Shirt    |             | No
     2    | ABCS | Green T-Shirt S  | ABC         | Yes
     3    | ABCM | Green T-Shirt M  | ABC         | Yes
     4    | ABCL | Green T-Shirt L  | ABC         | Yes

Table_B

 child entry_id | parent entry_id
----------------------------------
        2       |       1
        3       |       1
        4       |       1

So far, here's the query I've been running.

INSERT INTO Table_B
(parent_entry_id, child_entry_id)

    SELECT
        Table_A.entry_id,
        (SELECT Table_A.entry_id FROM Table_A
         WHERE Table_A.SKU = Table_A.Parent_SKU
         AND Table_A.IsVariant = 'No')
    FROM Table_A
    WHERE IsVariant = 'Yes'

I know full well the logic is wrong, but after a day of trying to think it through my brain is frazzled. I've searched on here and other places as best as I can think, but without any thoughts or answers to help. So can anyone here assist? Should I be thinking the query the another way round?

  • Try taking a look at recursive queries. – Frank Allenby Sep 15 '11 at 11:57
  • Thanks @FrankAllenby. The best post I've found on the subject so far is [here](http://www.dbforums.com/db2/1610901-how-write-recursive-sql-join-i-am-not-sure-even-posible.html) as it seems to be on a related subject. I'm not quite sure how this will help me. – Justin Taylor Sep 15 '11 at 12:17

2 Answers2

0

Maybe I am missing something here, but

select child.entry_id, parent.entry_id 
from Table_A child
join Table_A parent
on (child.parent_sku = parent.sku)
where child.isVariant = 'Yes'
Thilo
  • 257,207
  • 101
  • 511
  • 656
  • So my full query now reads (as below) but the Error message reads Operand should contain 1 column(s) `INSERT INTO Table_B (parent_entry_id, parent_field_id, child_entry_id, rel_order) SELECT Table_A.entry_id, 364, (SELECT child.entry_id, parent.entry_id FROM Table_A child JOIN Table_A parent ON (child.field_id_364 = parent.field_id_20) WHERE child.field_id_363 = 'Yes'), 0 FROM Table_A WHERE Table_A.field_id_363 = 'Yes'` – Justin Taylor Sep 15 '11 at 12:12
  • your full query? That was not part of the question... Does the SELECT I show work as expected? – Thilo Sep 15 '11 at 12:21
  • Sorry @Thilo. It runs, but brings back 120,000 results (and the results don't match up). There are 14,000 entries (or SKUs) in the database. I'm expecting in the region of a much smaller selection of Parent/Child products. – Justin Taylor Sep 15 '11 at 12:33
0

If you have only need a single level of parent-child relationships (no grandchildren, grandgrand.., etc) then you can simply do a self join:

SELECT T.entry_id as child_id, T2.entry_id as parent_id

FROM TableA AS T LEFT JOIN TableA AS T2 

ON T.parent_SKU=T2.SKU

If you want to go trough the whole parent-child tree, Oracle and MSSQL have mechanisms that enable you to do that. I'm not familiar with such an option for MySQL.

There more info here: http://ftp.ntu.edu.tw/MySQL/tech-resources/articles/hierarchical-data.html http://stackoverflow.com/questions/53108/is-it-possible-to-make-a-recursive-sql-query

Edit: Ok, some data sanitation first: I do not know if this is an artefact of my import process or if it's the same at your end, but the parent_sku fields had a newline tacked on. Naturally this caused a failure trying to match a child's ABC parent_sku to a parents ABC\n SKU. Verify this with select * from TableA where parent_sku like '%\n' . If any columns match, run SET SQL_SAFE_UPDATES=0;

UPDATE TableA SET parent_sku = REPLACE(REPLACE(parent_sku, '\r', ''), '\n', '') ; This will remove the newlines. Same goes for empty string parent_sku's - if they aren't null at your end, make them null. Here is the revised query - I've added some pre-join filtering and it's faster now. Just double check your \n's and NULLs and it should produce the correct output:

SELECT child.entry_id as child_id, parent.entry_id as parent_id

FROM 

    (

        select * from TableA 

                -- only children have parent_sku
        where parent_SKU is not null

               -- some extra safety for empty strings, 
                -- can be removed after empties are replaced by NULLs
        and length(parent_SKU) > 0 and length(sku)>0 and length(parent_sku)>0

    )

        AS child 

LEFT JOIN 

    (

        select * from tableA 
                -- only parents don't have parent_sku
        where (parent_SKU is null or length(parent_SKU)=0) 

        and length(sku)>0

    ) AS parent



ON child.parent_SKU = parent.SKU

results

Noam Kremen
  • 388
  • 1
  • 14
  • Thanks Noam. As shown in my example (hopefully clearly!) it is only a parent-child relationship, as it's for products such as apparel or footwear where a Parent Item may have several Child items, such as size or colour options. – Justin Taylor Sep 15 '11 at 12:42
  • Noam, although I've tested this, it didn't work as I expected. Were you planning for me to input this within my subquery as in my query within my original question? – Justin Taylor Sep 15 '11 at 15:48
  • Yes, you still need to wrap it up inside the INSERT statement. – Noam Kremen Sep 15 '11 at 22:38
  • Thanks for your help, Noam. The code I've used is the following, but this just caused the query to run for about 30 minutes before I quit it - nothing was inserted into Table_B. `INSERT INTO Table_B (parent_entry_id, child_entry_id) (SELECT T.entry_id AS child_id, T2.entry_id AS parent_id FROM Table_A AS T LEFT JOIN Table_A AS T2 ON T.parent_sku = T2.sku)` Is there anything else you can suggest? – Justin Taylor Sep 16 '11 at 09:28
  • No, it brings a long result with child_id as 1 (and about 100 parent_id results), 2 (and about 100 parent_id results) etc, with parent_id as a 100 individual entry_id's which again recur. – Justin Taylor Sep 16 '11 at 17:35
  • Try replacing "ON T.parent_SKU=T2.SKU" with "ON IFNULL(T.parent_SKU,-1)=T2.SKU" . This should drop all cases where parent_SKU is null. I'm assuming the product SKU itself can never be null. – Noam Kremen Sep 16 '11 at 18:57
  • That's right, the product SKU is never null. Every product whether parent (which are usually 'fake' or non-) or child (the actual product in varying size or colour) has a SKU. Thank you for the revised query, but it brings the same results as the previous one. _starts to pull out clumps of hair...!_ – Justin Taylor Sep 16 '11 at 20:53
  • If you could post a larger data subset (maybe to pastebin.com), I might be able to help more. "Select Top 1000 entry_id, SKU, parent_SKU" should be enough and probably wouldn't contain any confidential data (unless SKUs are very descriptive) – Noam Kremen Sep 17 '11 at 07:41
  • Ran it, results are as expected: [link](http://pastebin.com/UQy4t6zP). Assuming that your data does not contain any anomalies, the only way to produce the incorrect results you describe is if you wrote FROM TableA AS T LEFT JOIN TableA AS T2 ON T2.parent_SKU=T.SKU Instead of the following correct order: FROM TableA AS T LEFT JOIN TableA AS T2 ON T.parent_SKU=T2.SKU Make sure that the table on the left side of the join is the parent_SKU table. Order is important in this case. – Noam Kremen Sep 17 '11 at 11:31
  • Just been running some tests, including duplicating everything I've written out here. This is DEFINITELY not working in my setup. I'm running the latest version of MySQL. The results I'm getting are here. http://pastebin.com/4dnkpK5v – Justin Taylor Sep 19 '11 at 13:28
  • Justin, can you post the table you're working on? Only the entry_id, SKU, parent_SKU fields are needed. – Noam Kremen Sep 19 '11 at 18:03
  • Noam, you're a hero! That was it! Thank you so much for your assistance with this matter. – Justin Taylor Sep 20 '11 at 08:38