I am working with a MySQL database table that I do not own (i.e., I cannot change the layout/schema). One of the fields/columns has data that looks like this (5 records shown):
12345 - descriptive text
98765 - descriptive text
12345 - descriptive text; 3 x 67890 - descriptive text
12345 - descriptive text; 67890 - descriptive text
12345 - descriptive text; 2 x 67890 - descriptive text
In each record, this field contains one or two distinct 5-digit codes. I need to sum all instances of each specific 5-digit code. As you can see in the example above however, some records have more than one instance of a 5-digit code with the second 5-digit code optionally preceded by a multiplier. That said, based on the example, the 5-digit Code totals would be as follows:
5-digit Code | Count |
---|---|
12345 | 4 |
98765 | 1 |
67890 | 6 |
Since certain 5-digit codes may only appear by themselves in a record (not two different 5-digit codes delimited by a semicolon) and I have a list of these codes, I am able to sum the instances of these codes using the following SQL snippet for each specific code:
SELECT
COUNT(*),
SUM(
CASE
WHEN code LIKE '%12345%'
THEN 1
ELSE 0
END
) as 'count_12345', ...
The above code works great when there is only one 5-digit code per record. Unfortunately, it gets really ugly when there are two codes in the column delimited by a semicolon. If I was pre/post processing this field in code (C#, PHP, et al), I would simply split the field on the "; " and then parse the second substring to find the optional multiplier. As a relative SQL newbie however, I'm not sure how to go about performing similar string tests/manipulations, accompanied by the appropriate branching logic. Conceptually I understand how to do what I want but syntactically I have yet to find the right combination of SQL commands that yield the desired result when I execute a test query in phpMyAdmin (the environment doesn't provide much, if any, feedback when I enter invalid code).