I'm trying to perform a screen scrape because I can't find a relevant free API for the data I need. I've managed to fetch the HTML page, but I'm stuck on extracting the relevant information from it. I'm guessing I'll need regular expressions to search through the HTML, but I'm unsure how to do this. The information I'm after is the make, model, and year of the car currently being searched for.

    let url = NSURL(string: "https://www.rac.co.uk/buying-a-car/car-passport/report/buyer/purchase/?BuyerVrm=yg06dxt")

    // NSURL(string:) returns nil for a malformed URL, so unwrap before use.
    if let url = url {
        let task = NSURLSession.sharedSession().dataTaskWithURL(url, completionHandler: { (data, response, error) -> Void in
            // Only decode the body when the request succeeded and returned data.
            if error == nil {
                if let data = data {
                    if let urlContent = NSString(data: data, encoding: NSASCIIStringEncoding) {
                        print(urlContent)
                    }
                }
            }
        })
        task.resume()
    }

Here's a sample of the returned information:

<p class="CarMiniProfile-caveat u-hidden">*image for illustrative purposes only</p>

            <div>
                <table class="CarMiniProfile-table">
                    <tbody>
                        <tr class="CarMiniProfile-tableFirstRow">
                            <td class="CarMiniProfile-tableHeader">
                                Make
                            </td>
                            <td>
                                FIAT
                            </td>
                        </tr>
                        <tr>
                            <td class="CarMiniProfile-tableHeader">
                                Model
                            </td>
                            <td>
                                PUNTO SPORTING M-JET
                            </td>
                        </tr>
                        <tr>
                            <td class="CarMiniProfile-tableHeader">
                                Colour
                            </td>
                            <td>
                                BLUE
                            </td>
                        </tr>
                        <tr>
                            <td class="CarMiniProfile-tableHeader">
                                Year
                            </td>
                            <td>
                                2006
                            </td>
                        </tr>
                        <tr>
                            <td class="CarMiniProfile-tableHeader">
                                Engine Size
                            </td>
                            <td>

1910 cc

                            </td>
                        </tr>
                    </tbody>
                </table>
            </div>

            <h3 class="CarMiniProfile-subheading">Check this car in 3 simple steps...</h3>
Bobby

1 Answer

Using regexes to parse HTML isn't a good idea, I agree, although I've sometimes had to do some really nasty things with regexes and HTML myself.

If you absolutely must do it this way then here's one for MAKE:

<td.*?CarMiniProfile-tableHeader.*?\n*(.*?)\n*<\/td>

You should be able to customise this for everything else you need. Using regexes is definitely not a recommended solution for this though.
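
The pattern above targets the header cell, so its capture group is the label rather than the value. As a rough sketch of how it might be adapted in Swift, the helper below matches the header cell for a given label and captures the text of the cell that follows it. It is written in current Swift syntax, `carField` is a made-up name, the pattern is my own variation rather than the one above, and it assumes the markup looks like the sample in the question, so treat it as a starting point only.

```swift
import Foundation

/// Returns the trimmed text of the <td> that follows the header cell
/// labelled `label`, or nil if no such row is found.
/// Hypothetical helper: assumes the table markup shown in the question.
func carField(_ label: String, in html: String) -> String? {
    // Escape the label so it is matched literally inside the pattern.
    let escaped = NSRegularExpression.escapedPattern(for: label)
    // Header cell holding the label, then the next <td>, whose text is captured.
    // \s* absorbs the newlines and indentation around each cell's text.
    let pattern = "<td[^>]*CarMiniProfile-tableHeader[^>]*>\\s*\(escaped)\\s*</td>"
        + "\\s*<td[^>]*>\\s*(.*?)\\s*</td>"
    guard let regex = try? NSRegularExpression(
        pattern: pattern,
        options: [.caseInsensitive, .dotMatchesLineSeparators]
    ) else {
        return nil
    }
    let searchRange = NSRange(html.startIndex..<html.endIndex, in: html)
    guard let match = regex.firstMatch(in: html, options: [], range: searchRange),
          let valueRange = Range(match.range(at: 1), in: html) else {
        return nil
    }
    return String(html[valueRange])
}

// A trimmed-down copy of the sample HTML from the question.
let sample = """
<table class="CarMiniProfile-table">
    <tbody>
        <tr class="CarMiniProfile-tableFirstRow">
            <td class="CarMiniProfile-tableHeader">
                Make
            </td>
            <td>
                FIAT
            </td>
        </tr>
        <tr>
            <td class="CarMiniProfile-tableHeader">
                Model
            </td>
            <td>
                PUNTO SPORTING M-JET
            </td>
        </tr>
        <tr>
            <td class="CarMiniProfile-tableHeader">
                Year
            </td>
            <td>
                2006
            </td>
        </tr>
    </tbody>
</table>
"""

let make = carField("Make", in: sample)
let model = carField("Model", in: sample)
let year = carField("Year", in: sample)
print(make ?? "?", model ?? "?", year ?? "?")
```

In the question's code you would call this on the downloaded page string inside the completion handler. It will break as soon as the site changes its class names or table layout, which is one more reason regexes are a fragile choice here.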

user1532669
  • Thanks for that. If not regexes, what would you suggest as a better way to do this? – Bobby Dec 03 '15 at 08:44
  • No problem. Maybe something like this would help you: http://search.cpan.org/~ether/WWW-Mechanize-1.75/lib/WWW/Mechanize.pm – user1532669 Dec 03 '15 at 19:59