0

Multi-currency REGEXP_SUBSTR Oracle11g

First time posting a question here, so I hope I don't muck it up too badly.

I created a query that pulls various customer order details including charged price, product list price and the big one, the price that was sent to customers via reminder emails.

Using REGEXP_SUBSTR I am able to match all of the prices from the emails HTML content in various currencies but I run into an issue with the output for certain Price currency abbreviation combinations lacking commas or periods: i.e. 123 kr, 999 Pesos or 1 050 Kč.

How can I make this scenario above match the output of the other price formats?

I pulled much of my "inspiration" from Gary's answer here: Regex currency validation.

Datasource HTML

Desired values starting at <!-- START Price Exp.. -->:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
        <head>
            <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

            <title>Thank you for using us</title>
            <style type="text/css">
                .ReadMsgBody {
                    width: 100%;
                }
                .ExternalClass {
                    width: 100%;
                }
                BODY {
                    font-family: OpenSans, Arial, Helvetica, sans-serif;
                    font-size:13px;
                    color:#555555;

                }
                TD {
                    font-family: OpenSans, Arial, Helvetica, sans-serif;
                    font-size:13px;
                    color:#555555;
                    vertical-align: top;
                }
                A {
                    color: #f48024;

                }
                IMG {
                    display:block;
                    border: none;
                }
                H1 {
                    font-size: 18pt;

                }
                H2 {
                    font-size: 15pt; 
                }
                H1, H2, H3, P, UL, LI {
                    margin: 0;
                    padding: 0;
                }                  
            </style>
        </head>

        <body style="margin: 0; padding: 0; background-color: #eeeeee" bgcolor="#eeeeee">

            <table width="100%"  border="0" cellpadding="0" cellspacing="0" style="margin: 0; padding: 0; ">
                <tbody>
                    <tr>
                        <td align="center" width="100%" >

                     <!-- TOP-->
                            <table bgcolor="#eeeeee" border="0" cellpadding="0" cellspacing="0" style="background-color: #eeeeee; width:100%; max-width:900px; ">                           
                                <tbody>

                                    <tr>
                                        <td height="34" style="font-size: 1px;"><!-- cell --></td>
                                    </tr>
                                </tbody>
                            </table>
                            <!-- END TOP-->

                            <!-- LOGO -->
                            <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%;  background-color: #fff; max-width:900px;">
                                <tr>
                                    <td>
                                        <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="background-color: #fff; text-align: center; max-width:650px;">

                                            <tr>
                                                <td align="center" height="40" bgcolor="#fff" style="background-color: #fff; vertical-align: middle; text-align: center; ">
                                                    <a href="https://www.company.com/" target="_blank"><img align="center" style="" src="https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon@2.png?v=73d79a89bded" style="display:block" alt="" /></a>
                                                </td>
                                            </tr>

                                        </table>
                                    </td>
                                </tr>
                            </table>
                            <!-- END LOGO-->

                           <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; max-width:900px; background-color: #fff;">

                                <tr>
                                    <td>

                                        <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%;  background-color: #fff; max-width:800px; padding-left:10px; padding-right:10px;">     
                                            <tr>
                                                <td height="30" style="font-size: 1px;"><!-- cell --></td>
                                            </tr> 

                                            <tr>
                                                <td>
                                                    <p style="text-align: center; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:22px; color:#41424e; line-height: 1.4"><b>Example Template</b></p>
                                                </td>
                                            </tr>

                                            <tr>
                                                <td height="34" style="font-size: 1px;"><!-- cell --></td>
                                            </tr> 

                                            <tr>
                                                <td>
                                                    <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#767683; line-height: 1.4">Dear customername,</p>
                                                </td>
                                            </tr>
                                             <tr>
                                                <td height="20" style="font-size: 1px;"><!-- cell --></td>
                                            </tr> 

                                            <tr>
                                                <td>
                                                    <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#767683; line-height: 1.4">Your productname - 1PC has been successfully renewed.</p>
                                                </td>
                                            </tr>

                                            <tr>
                                                <td height="20" style="font-size: 1px;"><!-- cell --></td>
                                            </tr>

                                            <tr>
                                                <td>
                                                    <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#767683; line-height: 1.4">Details of your sub below.</p>
                                                </td>
                                            </tr>

                                            <tr>
                                                <td height="40" style="font-size: 1px;"><!-- cell --></td>
                                            </tr>
                                        </table>
                                    </td>  
                                </tr>         
                            </table>  

                            <!-- sum-->
                            <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#F2F2F6" style="width:100%;  background-color: #F2F2F6; max-width:900px;">     
                                <tr>
                                    <td>
                                        <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#F2F2F6" style="width:100%;  background-color: #F2F2F6; max-width:800px; padding-left:10px; padding-right:10px;">     
                                            <tr>
                                                <td height="34" style="font-size: 1px;"><!-- cell --></td>
                                            </tr> 

                                                <tr>
                                                    <td>
                                                        <p style="text-align: center; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:20px; color:#41424e; line-height: 1.4"><b>Your Auto-Renewal Sub</b></p>
                                                    </td>
                                                </tr>
                                                <tr>
                                                    <td height="20" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 

                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Product</b></p>
                                                    </td>
                                                </tr>
                                                <tr>
                                                    <td height="1" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">Productname - 1 PC</p>
                                                    </td>
                                                </tr>
                                                <tr>
                                                    <td height="20" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Order ID</b></p>
                                                    </td>
                                                </tr>
                                                <tr>
                                                    <td height="1" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">12131415161</p>
                                                    </td>
                                                </tr>
                                                <tr>
                                                    <td height="20" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Exp Prices charged</b></p>
                                                    </td>
                                                </tr>
                                                <tr>
                                                    <td height="1" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                            <!--START Price Exp, templates could be in numerous different languages but info like i.e. customername, productname, Order ID, Tracking IDs will always use the same format. --> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">$69.99 (a tax message)</p>
                                                    </td>
                                                </tr>
                                                <tr>
                                                    <td height="1" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">123 kr (b tax message)</p>
                                                    </td>
                                                </tr>
                                                <tr>
                                                    <td height="1" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">999 Pesos (c tax message)</p>
                                                    </td>
                                                </tr>
                                    <!--END Price Exps -->
                                               <tr>
                                                    <td height="20" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Automatically renewed</b></p>
                                                    </td>
                                                </tr>
                                                <tr>
                                                    <td height="1" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">May 20, 2018</p>
                                                    </td>
                                                </tr>
                                            <tr>
                                                <td height="42" style="font-size: 1px;"><!-- cell --></td>
                                            </tr>
                                        </table>   
                                    </td>
                                </tr>
                            </table>
                            <!--END sum -->  

                            <!-- white-->
                            <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%; max-width:900px; background-color: #fff;">
                                <tr>
                                    <td height="15" style="font-size: 1px;"><!-- cell --></td>
                                </tr>

                                <tr>
                                    <td>

                                        <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#fff" style="width:100%;  background-color: #fff; max-width:800px; padding-left:10px; padding-right:10px;">     
                                            <tr>
                                                <td height="30" style="font-size: 1px;"><!-- cell --></td>
                                            </tr> 
                                            <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">If you’d like to check your order status, please sign in to <a href="https://www.company.com/en-us/order?pgm=6916670010" target="_blank">company.com/orders</a> with the login credentials below.</p>
                                                    </td>
                                                </tr>
                                               <tr>
                                                    <td height="20" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><b>Order ID:</b> 12131415161</p>
                                                    </td>
                                                </tr>
                                            <tr>
                                                    <td height="1" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><b>Password:</b> stAcKoverFlOwrocks</p>
                                                    </td>
                                                </tr>
                                            <tr>
                                                    <td height="20" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#42ba8f; line-height: 1.4"><b>Your Plan</b></p>
                                                    </td>
                                                </tr>
                                            <tr>
                                                    <td height="1" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                                <tr>
                                                    <td>
                                                        <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><strong>Auto-Renewal Terms</strong><p>By completing your purchase, you have authorized us to do a bunch of legal stuff.</p>
                                                    </td>
                                                </tr>
    <tr>
                                                    <td height="30" style="font-size: 1px;"><!-- cell --></td>
                                                </tr> 
                                            <tr>
                                                <td>
                                                    <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><b>Need help?</b></p>
                                                </td>
                                            </tr>
                                            <tr>
                                                <td height="1" style="font-size: 1px;"><!-- cell --></td>
                                            </tr> 
    <tr>                                                                                      
                                                <td  style="vertical-align: middle;">                                               
                                                    <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4"><a href="https://company.com/en_US/support" target="_blank">company.com/help</a></p>
                                                </td>
                                            </tr>
                                            <tr>
                                                <td height="30" style="font-size: 1px;"><!-- cell --></td>
                                            </tr> 
                                            <tr>
                                                <td>
                                                    <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:16px; color:#41424e; line-height: 1.4">Thanks for trusting us.</p>
                                                </td>
                                            </tr>


                                            <tr>
                                                <td height="34" style="font-size: 1px;"><!-- cell --></td>
                                            </tr> 
                                        </table>
                                    </td>
                                </tr>
                            </table>
                            <!-- END white -->
                            <!--FOOTER-->
                            <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#777684" style="width:100%;  background-color: #E7E7EF; max-width:900px;">     
                                <tr>
                                    <td height="30" style="font-size: 1px;"><!-- cell --></td>
                                </tr> 
                                <tr>
                                    <td>
                                        <table align="center" cellspacing="0" cellpadding="0" border="0" bgcolor="#777684" style="width:100%;  background-color: #E7E7EF; max-width:900px;">
                                <tr>
                                   <td>
                                        <table><tr><td>
                                                    <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:14px; color:#767683; line-height: 1.4">Trouble installing? <u><a href="https://www.company.com/en-us/faq.php" target="_blank">Visit FAQ</a></u></p>
                                                    <p style="text-align: left; margin:0; padding: 0; font-family: Arial, Helvetica, sans-serif; font-size:14px; color:#767683; line-height: 1.4">Curious for more? <u><a href="https://www.company.com/en-us" target="_blank">Find more</a></u></p>
                                                </td></tr></table>
                                    </td>
                                    <td>
                                        <table align="right" >
                                            </tr> </table> 
                                    </td>
                                </tr>
                                <tr>
                                    <td height="30" style="font-size: 1px;"><!-- cell --></td>
                                </tr>
                                        </table>
                                    </td>
                                </tr>
                            </table>           
                            <!--END FOOTER -->  
                        </td>
                    </tr>
                </tbody> 
            </table>
        </body>
    </html>

REGEX

(NT\$|SAR)\s(\d{2,5})|\d{1,4}([.,]\d{3})*([\s.,]\d{2,3}|[^\W]\d+(\d{1,4})*\s(kr|zł|Pesos|Kč|Ft|บาท|SAR|₪))

Matches all

  • 1.4">$19.99 (some random text)
  • 1.4">R$20.00
  • 1.4">20.00€
  • 1.4">€25,99
  • 1.4">£15.99
  • 1.4">123 kr
  • 1.4">1234 Ft
  • 1.4">999 Pesos

Output

  • 19.99
  • 20.00
  • 20.00
  • 25.99
  • 15.99
  • 123 kr
  • 1234 Ft
  • 999 Pesos

The last three examples should not have a space and/or any letters following the numbers.

How can I remove them from the output but retain the numbers?


I realize this is likely due to the multiple capture groups I have, so I see three potential solutions:

  1. Refine the Regex to eliminate the overuse of groups (I am not skilled enough to figure this out)...
  2. Somehow write Regex non-capturing groups that result in the desired output. I have sadly learned that (?:) simply doesn't work.
  3. Utilize the SQL function parameters to select subexpressions from REGEXP_SUBSTR. However, this doesn't appear to allow more than one subexpression in the output.

SQL

SELECT
REPLACE(REPLACE(REGEXP_SUBSTR(nnc.MESSAGE, '(NT\$|SAR)\s(\d{2,5})|\d{1,4}([.,]\d{3})*([\s.,]\d{2,3}|[^\W]\d+(\d{1,4})*\s(kr|zł|Pesos|Kč|Ft|บาท|SAR|₪))'),',','.'),' ','') AS EMAIL_PRICE_SENT
FROM tablename
WHERE clause;

That is the full statement with a couple nested REPLACE functions to format the output to the system format.

I know this is the wrong language so it will not provide a 100% accurate test but I find it very helpful before running it against the DB. I am always open for better tool suggestions!

I have spent too much time than I am proud of on this, so any assistance would be much appreciated.

Thanks, Nick

Community
  • 1
  • 1
  • 1
    what was the actual question? – Tibrogargan Apr 22 '17 at 02:12
  • Have you tried using simply regexp_replace? Are you able to show us an example of what is stored in nnc.MESSAGE (obviously clean up any private data)? – grail Apr 22 '17 at 03:09
  • I appreciate the effort, but you didn't actually ask any question here. This is just a big collection of info about regexes. A tip for the future: if you didn't include a '?' at the end of a sentence in your post, you should continue editing. – Peter G Apr 22 '17 at 06:09
  • Wow, guess I was awake too late...sorry for that, **fixed**. @grail I don't think REGEXP_REPLACE would help here as I am able to match all prices and return most scenarios in the correct format. I can try to post an example of the content but I will actually be away from a keyboard for two weeks starting tonight. – cleanSteve Apr 22 '17 at 07:19
  • A couple of examples would really help as your regex is quite confusing and also would not perform as expected, case in point \d+(\d{1,4})*, as regex is greedy, the plus on the first \d would chew up all numbers so that the grouped one would never have anything in it due to the zero or more constraint after. At best if the group were to force at least one number that would be the max it would ever get due to previous constraint. – grail Apr 22 '17 at 07:31
  • You can't do that with a single regex in Oracle. You have several alternatives and you need specific group contents. – Wiktor Stribiżew Apr 22 '17 at 08:38
  • @grail I have added example HTML datasource code to the question. I created the Regex through trial and error so mistakes were unfortunately, expected. Hopefully the example is helpful. – cleanSteve Apr 22 '17 at 16:52
  • @Wiktor would you be able to expand on those alternatives? – cleanSteve Apr 22 '17 at 16:52
  • Looking at the current example, do all examples contain the words 'tax message' in the lines with the numbers we want? If so, this can be a lot simpler. – grail Apr 22 '17 at 18:34
  • Unfortunately no, that was just a placeholder. For example for en_US that message says (plus applicable taxes) but some notification events that are also being queried don't contain this tax message at all. It would have been great if it did... – cleanSteve Apr 22 '17 at 20:45

0 Answers0