0

So I'm in Teradata trying to pull any products that have more than 1 color-related name, as seen in the code snippet here:

SELECT
pt.product_number,
COUNT (CASE WHEN ot.option_name like any ('%green%', '%red%', '%blue%') THEN 1 ELSE NULL END) as differentColorCount
FROM product_table pt
JOIN option_table ot on ot.product_num = pt.product_num
HAVING differentColorCount > 1
GROUP BY 1

This is running fine, but the problem that I'm realizing is that a product might have a hundred different "Red" options for instance. (Red-1, Red-2, Red-3, etc). But I only want a count of when two of the different color strings are present for a single product.

So instead of LIKE ANY what I really need is LIKE ANY TWO. If both Red AND Green are present, count 1. If both Blue AND Purple are present, count 1.

I realize I could do a really long list where I do dozens of LIKE ALLs in every possible combination, but that doesn't seem like it will scale well if I need to check for, say 100 different colors instead of 6?

If anyone has any insight on this I would be incredibly grateful. Thanks in advance for any help you can offer! :)

ender777
  • 43
  • 6
  • You shouldn't be storing multiple options in a single string. – Gordon Linoff Nov 01 '17 at 19:51
  • what does your database look like? To Gordon's point are you storing all values in a single field? – David Lee Nov 01 '17 at 19:52
  • if as gordon suggest you are storing multiple values in `option_name` you need convert that string to multiple rows https://stackoverflow.com/questions/29105836/turning-a-comma-separated-string-into-individual-rows-in-teradata – Juan Carlos Oropeza Nov 01 '17 at 19:56
  • No, the options are broken out in another table, I just removed some JOIN clutter to try to make the issue a little clearer. Basically I'm checking the Option Names in the break-out table for multiple colors, for each product. – ender777 Nov 01 '17 at 19:58
  • I pasted a fuller version with the table JOIN at the end of the question. Thanks! – ender777 Nov 01 '17 at 19:58

1 Answers1

0

You can utilize a regular expression to extract the color and then apply a distinct count:

Count (DISTINCT RegExp_Substr(option_name, '(green|red|blue)')) AS differentColorCount

This is similar to your like any ('%green%', '%red%', '%blue%'), but returns the actual matching color instead of TRUE/FALSE.

The'(green|red|blue)' search pattern seperates defines three alternative search strings and returns the first match.

dnoeth
  • 59,503
  • 4
  • 39
  • 56
  • I'm not very familiar with RegExp_Substr, is it creating an array and then cycling through them looking for a given number of matches within the array? – ender777 Nov 01 '17 at 20:36
  • @ender777: This extracts one of those listed substrings when it's found, simply run it without the count to check the result. – dnoeth Nov 01 '17 at 20:40
  • That makes sense, thanks again! Heading to Google for an hour to two (or five) to learn RegExp. :) – ender777 Nov 01 '17 at 20:41
  • 1
    @ender777: caution, this might end in 100++ hours, Regular Expressions are a programming language :-) – dnoeth Nov 01 '17 at 20:43