Multi-currency REGEXP_SUBSTR Oracle11g
First time posting a question here, so I hope I don't muck it up too badly.
I created a query that pulls various customer order details including charged price, product list price and the big one, the price that was sent to customers via reminder emails.
Using REGEXP_SUBSTR I am able to match all of the prices from the emails HTML content in various currencies but I run into an issue with the output for certain Price currency abbreviation combinations lacking commas or periods: i.e. 123 kr, 999 Pesos or 1 050 Kč.
How can I make this scenario above match the output of the other price formats?
I pulled much of my "inspiration" from Gary's answer here: Regex currency validation.
Datasource HTML
Desired values starting at <!-- START Price Exp.. -->
:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Thank you for using us</title>
<style type="text/css">
.ReadMsgBody {
width: 100%;
}
.ExternalClass {
width: 100%;
}
BODY {
font-family: OpenSans, Arial, Helvetica, sans-serif;
font-size:13px;
color:#555555;
}
TD {
font-family: OpenSans, Arial, Helvetica, sans-serif;
font-size:13px;
color:#555555;
vertical-align: top;
}
A {
color: #f48024;
}
IMG {
display:block;
border: none;
}
H1 {
font-size: 18pt;
}
H2 {
font-size: 15pt;
}
H1, H2, H3, P, UL, LI {
margin: 0;
padding: 0;
}
</style>
</head>
<body style="margin: 0; padding: 0; background-color: #eeeeee" bgcolor="#eeeeee">
<table width="100%" border="0" cellpadding="0" cellspacing="0" style="margin: 0; padding: 0; ">
<tbody>
<tr>
<td align="center" width="100%" >
<!-- TOP-->
<table bgcolor="#eeeeee" border="0" cellpadding="0" cellspacing="0" style="background-color: #eeeeee; width:100%; max-width:900px; ">
<tbody>
<tr>
<td height="34" style="font-size: 1px;"><!-- cell --></td>
</tr>
</tbody>
</table>
<!-- END TOP-->
<!-- LOGO -->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; background-color: #fff; max-width:900px;">
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="background-color: #fff; text-align: center; max-width:650px;">
<tr>
<td align="center" height="40" bgcolor="#fff" style="background-color: #fff; vertical-align: middle; text-align: center; ">
<a href="https://www.company.com/" target="_blank"><img align="center" style="" src="https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon@2.png?v=73d79a89bded" style="display:block" alt="" /></a>
</td>
</tr>
</table>
</td>
</tr>
</table>
<!-- END LOGO-->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; max-width:900px; background-color: #fff;">
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; background-color: #fff; max-width:800px; padding-left:10px; padding-right:10px;">
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: center; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:22px; color:#41424e; line-height: 1.4"><b>Example Template</b></p>
</td>
</tr>
<tr>
<td height="34" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#767683; line-height: 1.4">Dear customername,</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#767683; line-height: 1.4">Your productname - 1PC has been successfully renewed.</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#767683; line-height: 1.4">Details of your sub below.</p>
</td>
</tr>
<tr>
<td height="40" style="font-size: 1px;"><!-- cell --></td>
</tr>
</table>
</td>
</tr>
</table>
<!-- sum-->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#F2F2F6" style="width:100%; background-color: #F2F2F6; max-width:900px;">
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#F2F2F6" style="width:100%; background-color: #F2F2F6; max-width:800px; padding-left:10px; padding-right:10px;">
<tr>
<td height="34" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: center; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:20px; color:#41424e; line-height: 1.4"><b>Your Auto-Renewal Sub</b></p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Product</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">Productname - 1 PC</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Order ID</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">12131415161</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Exp Prices charged</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<!--START Price Exp, templates could be in numerous different languages but info like i.e. customername, productname, Order ID, Tracking IDs will always use the same format. -->
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">$69.99 (a tax message)</p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">123 kr (b tax message)</p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">999 Pesos (c tax message)</p>
</td>
</tr>
<!--END Price Exps -->
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Automatically renewed</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">May 20, 2018</p>
</td>
</tr>
<tr>
<td height="42" style="font-size: 1px;"><!-- cell --></td>
</tr>
</table>
</td>
</tr>
</table>
<!--END sum -->
<!-- white-->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; max-width:900px; background-color: #fff;">
<tr>
<td height="15" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; background-color: #fff; max-width:800px; padding-left:10px; padding-right:10px;">
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">If you’d like to check your order status, please sign in to <a href="https://www.company.com/en-us/order?pgm=6916670010" target="_blank">company.com/orders</a> with the login credentials below.</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><b>Order ID:</b> 12131415161</p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><b>Password:</b> stAcKoverFlOwrocks</p>
</td>
</tr>
<tr>
<td height="20" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Your Plan</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><strong>Auto-Renewal Terms</strong><p>By completing your purchase, you have authorized us to do a bunch of legal stuff.</p>
</td>
</tr>
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><b>Need help?</b></p>
</td>
</tr>
<tr>
<td height="1" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td style="vertical-align: middle;">
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><a href="https://company.com/en_US/support" target="_blank">company.com/help</a></p>
</td>
</tr>
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">Thanks for trusting us.</p>
</td>
</tr>
<tr>
<td height="34" style="font-size: 1px;"><!-- cell --></td>
</tr>
</table>
</td>
</tr>
</table>
<!-- END white -->
<!--FOOTER-->
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#777684" style="width:100%; background-color: #E7E7EF; max-width:900px;">
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
<tr>
<td>
<table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#777684" style="width:100%; background-color: #E7E7EF; max-width:900px;">
<tr>
<td>
<table><tr><td>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:14px; color:#767683; line-height: 1.4">Trouble installing? <u><a href="https://www.company.com/en-us/faq.php" target="_blank">Visit FAQ</a></u></p>
<p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:14px; color:#767683; line-height: 1.4">Curious for more? <u><a href="https://www.company.com/en-us" target="_blank">Find more</a></u></p>
</td></tr></table>
</td>
<td>
<table align="right" >
</tr> </table>
</td>
</tr>
<tr>
<td height="30" style="font-size: 1px;"><!-- cell --></td>
</tr>
</table>
</td>
</tr>
</table>
<!--END FOOTER -->
</td>
</tr>
</tbody>
</table>
</body>
</html>
REGEX
(NT\$|SAR)\s(\d{2,5})|\d{1,4}([.,]\d{3})*([\s.,]\d{2,3}|[^\W]\d+(\d{1,4})*\s(kr|zł|Pesos|Kč|Ft|บาท|SAR|₪))
Matches all
- 1.4">$19.99 (some random text)
- 1.4">R$20.00
- 1.4">20.00€
- 1.4">€25,99
- 1.4">£15.99
- 1.4">123 kr
- 1.4">1234 Ft
- 1.4">999 Pesos
Output
- 19.99
- 20.00
- 20.00
- 25.99
- 15.99
- 123 kr
- 1234 Ft
- 999 Pesos
The last three examples should not have a space and/or any letters following the numbers.
How can I remove them from the output but retain the numbers?
I realize this is likely due to the multiple capture groups I have, so I see three potential solutions:
- Refine the Regex to eliminate the overuse of groups (I am not skilled enough to figure this out)...
- Somehow write Regex non-capturing groups that result in the desired output. I have sadly learned that (?:) simply doesn't work.
- Utilize the SQL function parameters to select subexpressions from REGEXP_SUBSTR. However, this doesn't appear to allow more than one subexpression in the output.
SQL
SELECT
REPLACE(REPLACE(REGEXP_SUBSTR(nnc.MESSAGE, '(NT\$|SAR)\s(\d{2,5})|\d{1,4}([.,]\d{3})*([\s.,]\d{2,3}|[^\W]\d+(\d{1,4})*\s(kr|zł|Pesos|Kč|Ft|บาท|SAR|₪))'),',','.'),' ','') AS EMAIL_PRICE_SENT
FROM tablename
WHERE clause;
That is the full statement with a couple nested REPLACE functions to format the output to the system format.
- See Regex in action here: Regex 101 Link.
I know this is the wrong language so it will not provide a 100% accurate test but I find it very helpful before running it against the DB. I am always open for better tool suggestions!
I have spent too much time than I am proud of on this, so any assistance would be much appreciated.
Thanks, Nick