I have these entries:
id | fooddescription
--------------------
1 | 'Mollusks, oyster, eastern (blue point), wild, raw'
2 | 'Mollusks, oyster, eastern (blue point), wild, boiled or steamed'
3 | 'Vegetable oil, olive'
4 | 'Vegetable oil, almond'
5 | 'Pumpkin, boiled, drained, with salt'
6 | 'Pumpkin leaves, boiled, drained, with salt'
I want to treat the first two entries as one because they differ only in preparation method, and keep the others distinct. The words in each string are ordered from general to specific, and the last part (when the description has several parts separated by commas) is usually the preparation method, which doesn't need to be distinguished.
Desired result:
id | fooddescription
--------------------
1 | 'Mollusks, oyster, eastern (blue point), wild, '
3 | 'Vegetable oil, olive'
4 | 'Vegetable oil, almond'
5 | 'Pumpkin, boiled, drained, '
6 | 'Pumpkin leaves, boiled, drained, '
First I thought I could trim the string to remove the part after the last comma. So, following this MySQL answer, I wrote a PostgreSQL query:
SELECT reverse(
         substring(reverse(fooddescription),
                   position(',' in reverse(fooddescription)))) AS trimmed,
       count(*)
FROM food_name
GROUP BY trimmed
HAVING count(*) > 0;
I get this result:
'Mollusks, oyster, eastern (blue point), wild,'
'Vegetable oil,'
'Pumpkin, boiled, drained,'
'Pumpkin leaves, boiled, drained,'
"Vegetable oil," is not desirable, and I couldn't keep the id.
So my questions are:
- How can I count the delimiters (commas) and trim the last part only when there is more than one?
- Also, is it possible to keep one id for each group after GROUP BY?
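For reference, here is a minimal sketch of the behaviour I am after, assuming PostgreSQL and the food_name table shown above. The comma count comes from comparing the string length with and without commas, regexp_replace drops everything after the last comma, and min(id) is just one arbitrary way to pick a representative id per group. I have not checked it against edge cases such as a trailing comma or a NULL description.

```sql
-- Sketch only: trim the last part when a description has more than one comma,
-- then group on the trimmed text and keep the smallest id of each group.
SELECT min(id) AS id,
       trimmed AS fooddescription
FROM (
    SELECT id,
           CASE
               -- number of commas = length with commas minus length without them
               WHEN length(fooddescription)
                    - length(replace(fooddescription, ',', '')) > 1
               -- remove the text after the last comma (the comma itself stays)
               THEN regexp_replace(fooddescription, '[^,]*$', '')
               ELSE fooddescription
           END AS trimmed
    FROM food_name
) AS t
GROUP BY trimmed
ORDER BY min(id);
```

With the sample rows, this should leave 'Vegetable oil, olive' and 'Vegetable oil, almond' untouched, because each has only one comma, while the two oyster rows collapse into a single group that keeps id 1.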