5

I have this string:

var htmlString;

Assigned to:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >
<html>
<head>
<title>Payment Receipt</title>
<link rel="stylesheet" type="text/css" href="content/PaymentForm.css">
<style type="text/css">
</style>
<meta content='text/html; charset=UTF-8' http-equiv='Content-Type'/>
</head>
<body>
<div id="divPageOuter" class="PageOuter">
    <div id="divPage" class="Page">
        <!--[1]-->
        <div id="divThankYou">
             Thank you for your order!
        </div>
        <hr class="HrTop">
        <div id="divReceiptMsg">
             You may print this receipt page for your records.
        </div>
        <div class="SectionBar">
             Order Information
        </div>
        <table id="tablePaymentDetails1Rcpt">
        <tr>
            <td class="LabelColInfo1R">
                 Merchant:
            </td>
            <td class="DataColInfo1R">
                <!--Merchant.val-->
                Ryan
                <!--end-->
            </td>
        </tr>
        <tr>
            <td class="LabelColInfo1R">
                 Description:
            </td>
            <td class="DataColInfo1R">
                <!--x_description.val-->
                Rasmussenpayment
                <!--end-->
            </td>
        </tr>
        </table>
        <table id="tablePaymentDetails2Rcpt" cellspacing="0" cellpadding="0">
        <tr>
            <td id="tdPaymentDetails2Rcpt1">
                <table>
                <tr>
                    <td class="LabelColInfo1R">
                         Date/Time:
                    </td>
                    <td class="DataColInfo1R">
                        <!--Date/Time.val-->
                        09-Jul-2012 12:26:46 PM PT
                        <!--end-->
                    </td>
                </tr>
                <tr>
                    <td class="LabelColInfo1R">
                         Customer ID:
                    </td>
                    <td class="DataColInfo1R">
                        <!--x_cust_id.val-->
                        <!--end-->
                    </td>
                </tr>
                </table>
            </td>
            <td id="tdPaymentDetails2Rcpt2">
                <table>
                <tr>
                    <td class="LabelColInfo1R">
                         Invoice Number:
                    </td>
                    <td class="DataColInfo1R">
                        <!--x_invoice_num.val-->
                        176966244
                        <!--end-->
                    </td>
                </tr>
                </table>
            </td>
        </tr>
        </table>
        <hr id="hrBillingShippingBefore">
        <table id="tableBillingShipping">
        <tr>
            <td id="tdBillingInformation">
                <div class="Label">
                     Billing Information
                </div>
                <div id="divBillingInformation">
                     Test14 Rasmussen<br>
                    1234 test st<br>
                    San Diego, CA 92107 <br>
                </div>
            </td>
            <td id="tdShippingInformation">
                <div class="Label">
                     Shipping Information
                </div>
                <div id="divShippingInformation">
                </div>
            </td>
        </tr>
        </table>
        <hr id="hrBillingShippingAfter">
        <div id="divOrderDetailsBottomR">
            <table id="tableOrderDetailsBottom">
            <tr>
                <td class="LabelColTotal">
                     Total:
                </td>
                <td class="DescrColTotal">
                     &nbsp;
                </td>
                <td class="DataColTotal">
                    <!--x_amount.val-->
                    US&nbsp;$250.00
                    <!--end-->
                </td>
            </tr>
            </table>
            <!-- tableOrderDetailsBottom -->
        </div>
        <div id="divOrderDetailsBottomSpacerR">
        </div>
        <div class="SectionBar">
             Visa ****0027
        </div>
        <table class="PaymentSectionTable" cellspacing="0" cellpadding="0">
        <tr>
            <td class="PaymentSection1">
                <table>
                <tr>
                    <td class="LabelColInfo2R">
                         Date/Time:
                    </td>
                    <td class="DataColInfo2R">
                        <!--Date/Time.1.val-->
                        09-Jul-2012 12:26:46 PM PT
                        <!--end-->
                    </td>
                </tr>
                <tr>
                    <td class="LabelColInfo2R">
                         Transaction ID:
                    </td>
                    <td class="DataColInfo2R">
                        <!--Transaction ID.1.val-->
                        2173493354
                        <!--end-->
                    </td>
                </tr>
                <tr>
                    <td class="LabelColInfo2R">
                        Authorization Code:
                    </td>
                    <td class="DataColInfo2R">
                        <!--x_auth_code.1.val-->
                        07I3DH
                        <!--end-->
                    </td>
                </tr>
                <tr>
                    <td class="LabelColInfo2R">
                         Payment Method:
                    </td>
                    <td class="DataColInfo2R">
                        <!--x_method.1.val-->
                        Visa ****0027
                        <!--end-->
                    </td>
                </tr>
                </table>
            </td>
            <td class="PaymentSection2">
                <table>
                </table>
            </td>
        </tr>
        </table>
        <div class="PaymentSectionSpacer">
        </div>
    </div>
    <!-- entire BODY -->
</div>
<div class="PageAfter">
</div>
</body>
</html>

And I want to find the location of "x_auth_code.1.val" in the string. And then I want to obtain a string from the location plus a certain number of characters. The goal would be to return the Authorization code.

jscs
  • 63,694
  • 13
  • 151
  • 195
user1491739
  • 1,017
  • 2
  • 11
  • 16

2 Answers2

17

You can use indexOfSlice, and then slice() in StringOps

scala> val myString = "Hello World!"
myString: java.lang.String = Hello World!

scala> val index = myString.indexOfSlice("Wo")
index: Int = 6

scala> val slice = myString.slice(index, index+5)
slice: String = World

With your html string:

scala> htmlString.indexOfSlice("x_auth_code.1.val")
res4: Int = 2771
Christopher Chiche
  • 15,075
  • 9
  • 59
  • 98
  • The indexOfSlice is a good addition to one's tool suite due to its worst case BigO runtime complexity – Paul Aug 15 '19 at 15:32
0

Why aren't you using an XML parser? Don't treat XML as strings -- you'll get bitten if you do.

Here's a regex to do it, but my advice is: DO NOT USE IT! Use xml tools.

"""\Qx_auth_code.1.val\E[^>]*>([^<]*)""".r.findFirstMatchIn(htmlString).map(_ group 1)
Daniel C. Sobral
  • 295,120
  • 86
  • 501
  • 681
  • 5
    He's not using an XML parser, because he isn't dealing with XML. It's HTML 4.0 transitional. Of course there are parsers for that, but it might still be malformed. So if you want to recommend a parser, maybe something like jsoup would be good. But if he really just needs to extract one String, that is overkill. – Kim Stebel Jul 09 '12 at 20:29
  • @KimStebel There are plenty tools that handle such. For example, [Tag Soup](http://home.ccil.org/~cowan/XML/tagsoup/) and [JTidy](http://jtidy.sourceforge.net/). – Daniel C. Sobral Jul 09 '12 at 20:32
  • Yes, and all of them are probably overkill. – Kim Stebel Jul 09 '12 at 20:46
  • @KimStebel TagSoup is a lightweight SAX parser, so it's *less* overkill than the standard SAX parser. – Daniel C. Sobral Jul 09 '12 at 20:59
  • By overkill I meant it's one more dependency and probably doesn't simplify his code. – Kim Stebel Jul 09 '12 at 21:42