0

[New to asking questions] I'm trying to split multiple fields by the common delimited " " using SUBSTRING_INDEX and UNION ALL, which I've completed successfully from other stack overflow answers.

In another example I've also managed to INNER-JOIN another table (product_info)

The problem is when bringing the two together. I'm not familiar with the UNIONS (or INNER JOIN for that matter). So achieve the expected results have been difficult.

Some of the fields contain more than 1 item, which I understand is poor management in itself. Where NITM contains the item, the other fields contain the qty and price of each corresponding item delimited by a space.

example of code in table 'salesimport':

    ID  |       NITM        |  NQTY  |    SALE_PRICE     |
    1   |  CAP LIGHT CHAIR  | 1 1 2  |  2.99 4.99 44.99  |
    2   |      LIGHT        |   2    |       4.99        |
    3   |      CHAIR        |   4    |      44.99        |

example of code in table 'product_info':

    ID  |    PROD_ID   |  UNIT_COST  |  SUPPLIER  |
    25  |    CAP       |    1.00     |    X&Y     |
    87  |    LIGHT     |    1.23     |    X&Y     |
    79  |    CHAIR     |    9.00     |  JONES_CO  |

Just using the first spit method with SUBSTRING_INDEX and UNION ALL, each item in the order is entered onto a new row, e.g. ID 1, 1, 1, 2, 3 etc.

But when I try to add the following:

    INNER JOIN product_info
    ON salesimport.NITM = product_info.PROD_ID

The order lines become duplicated again, something similar to the example below, where each item gets every permutation.

    ID  |    ITEM    |  QTY  |    PRICE     |  UNIT_COST  |  SUPPLIER |
    1   |    CAP     |   1   |     2.99     |    1.00     |    X&Y    |
    1   |    CAP     |   1   |     2.99     |    1.23     |  JONES_CO |
    1   |   LIGHT    |   1   |     4.99     |    1.00     |    X&Y    |
    1   |   LIGHT    |   1   |     4.99     |    1.23     |  JONES_CO |

Here is my working code:

SELECT *,
       Substring_index(Substring_index(nitm, ' ', n.digit + 1), ' ', -1)
       AS ITEM,
       Substring_index(Substring_index(nqty, ' ', n.digit + 1), ' ', -1)
       AS QTY,
       Substring_index(Substring_index(sale_price, ' ', n.digit + 1), ' ', -1)
       AS PRICE,
       Format(Substring_index(Substring_index(nqty, ' ', n.digit + 1), ' ', -1)
              *
              Substring_index(Substring_index(sale_price, ' ', n.digit + 1), ' '
              , -1),
       2)
       AS SUBTOTAL,
FROM   salesimport
       INNER JOIN (SELECT 0 digit
                   UNION ALL
                   SELECT 1
                   UNION ALL
                   SELECT 2
                   UNION ALL
                   SELECT 3
                   UNION ALL
                   SELECT 4) AS n
               ON Length(Replace(nitm, ' ', '')) <= Length(nitm) - n.digit

example of above code:

    ID  |    ITEM    |  QTY  |    PRICE     |
    1   |    CAP     |   1   |     2.99     |
    1   |   LIGHT    |   1   |     4.99     |
    1   |   CHAIR    |   2   |    44.99     |
    2   |   LIGHT    |   2   |     4.99     |
    3   |   CHAIR    |   4   |    44.99     |

This is how I would like the final output to look:

    ID  |    ITEM    |  QTY  |    PRICE     |  UNIT_COST  |  SUPPLIER |
    1   |    CAP     |   1   |     2.99     |    1.00     |    X&Y    |
    1   |   LIGHT    |   1   |     4.99     |    1.23     |    X&Y    |
    1   |   CHAIR    |   2   |    44.99     |    9.00     |  JONES_CO |
    2   |   LIGHT    |   2   |     4.99     |    1.23     |    X&Y    |
    3   |   CHAIR    |   4   |    44.99     |    9.00     |  JONES_CO |

Sorry for any errors in advance

Madhur Bhaiya
  • 28,155
  • 10
  • 49
  • 57
jumpOnCommand
  • 97
  • 2
  • 8

1 Answers1

1

If time permits, I would rather normalize the data, instead of utilizing inefficient string operations. Read: Is storing a delimited list in a database column really that bad?

Anyways, in your case, you can utilize a Derived Table to JOIN here:

SELECT *
FROM   product_info AS pi
       JOIN (SELECT *,
                    Substring_index(Substring_index(nitm, ' ', n.digit + 1), ' '
                    , -1)
                                                   AS ITEM,
                    Substring_index(Substring_index(nqty, ' ', n.digit + 1), ' '
                    , -1)
                                                   AS QTY,
                    Substring_index(Substring_index(sale_price, ' ', n.digit + 1
                                    ), ' ',
                    -1)
                                                   AS PRICE,
                    Format(Substring_index(Substring_index(nqty, ' ',
                                           n.digit + 1), ' '
                           , -1)
                           *
                           Substring_index(Substring_index(sale_price, ' ',
                                           n.digit + 1
                                           ),
                           ' ', -1),
                    2)
                                                   AS SUBTOTAL,
             FROM   salesimport
                    INNER JOIN (SELECT 0 digit
                                UNION ALL
                                SELECT 1
                                UNION ALL
                                SELECT 2
                                UNION ALL
                                SELECT 3
                                UNION ALL
                                SELECT 4) AS n
                            ON Length(Replace(nitm, ' ', '')) <=
                               Length(nitm) - n.digit
                                      ) AS dt
         ON dt.nitm = pi.prod_id
Madhur Bhaiya
  • 28,155
  • 10
  • 49
  • 57
  • Thank you very much. I'll look at the links, esp Derived Tables. However, the results return the product_info table with salesimport INNER JOIN. Is it possible to reverse it the other way? To select * from salesimport, INNER JOIN product_info? – jumpOnCommand Aug 23 '19 at 12:32
  • @jumpOnCommand if all good, you should consider Upvoting (up arrow) and Marking the answer as accepted (green tick), for closure. – Madhur Bhaiya Aug 23 '19 at 12:35
  • @jumpOnCommand "To select * from salesimport, INNER JOIN product_info" can be done; give it a try using Derived Tables. – Madhur Bhaiya Aug 23 '19 at 12:44
  • I'm sorry, I've been trying for an hour now, and I can't seem to get a result which is based on the salesimport table. The salesimport table has 50k of sales orders. The product_info table contains a few hundred products. I'm looking to include the UNIT_COST and SUPPLIER fields to the salesimport table. More help would be appreciated. – jumpOnCommand Aug 23 '19 at 13:26
  • @jumpOnCommand please set up a [DB fiddle](https://www.db-fiddle.com/) with some sample data. Read this: https://meta.stackoverflow.com/q/333952/2469308 – Madhur Bhaiya Aug 23 '19 at 13:31
  • Thanks for your continued help. I'd never wrote a table manually or a Fiddle. I'll be sure to do this in future questions. (should I start a new post simplifying the question with the fiddle?) – jumpOnCommand Aug 24 '19 at 12:01
  • Here is the fiddle working showing the delimited rows. This works as expected. https://www.db-fiddle.com/f/iGDLtsuvZrUwZHrGmoux1U/2 – jumpOnCommand Aug 24 '19 at 12:03
  • However, once I JOIN the product_info table for obtain the SUPPLIER and UNIT_COST, the rows become duplicated. There should only be ID rows; 1, 1, 1, 2, 3 for the 5 items sold. https://www.db-fiddle.com/f/9jLtMgXB1MVHQvGXr3Yb8h/0 – jumpOnCommand Aug 24 '19 at 12:05
  • @jumpOnCommand Please check this fiddle: https://www.db-fiddle.com/f/9jLtMgXB1MVHQvGXr3Yb8h/1 You basically need two level of derived tables. – Madhur Bhaiya Aug 27 '19 at 07:32