You have two ways to do it:
- If the remote site sends CORS headers (`Access-Control-Allow-Origin: *`), you can inject it using an AJAX request. Please note that this will fail on browsers that do not support CORS.
- You can parse it on the server.
Option 2 is by far the preferable approach. It relies on two libraries (three if you're like me): curl, which handles the HTTP request, and `DOMDocument`, which handles the parsing.
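As a rough sketch of how the two pieces fit together (not the linked parser itself), something along these lines should work. The helper names `fetchHtml()` and `htmlToDom()`, the timeout value, and the example URL are my own assumptions; the curl and `DOMDocument` calls are the standard PHP ones.

```php
<?php
// Hypothetical helper: download a page body with curl.
function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_TIMEOUT        => 10,   // assumed timeout, in seconds
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl failed: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical helper: turn an HTML string into a DOMDocument.
function htmlToDom(string $html): DOMDocument
{
    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom;
}

// Usage (placeholder URL):
// $dom = htmlToDom(fetchHtml('https://example.com/page.html'));
```

Keeping the download and the parsing in separate functions makes it easy to test the parsing against a saved copy of the page without hitting the remote site every time.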
I wrote a parser for someone a while back. You can find it here: https://stackoverflow.com/a/16144603/2167834. It has a lot of detailed explanations on how to go through the DOM using `DOMDocument`.
Please note that `DOMDocument` is especially prone to breaking on the following:
- Incorrect charset definitions
- Broken HTML
- Inline JavaScript
You can, however, rewrite the source HTML before parsing it to deal with these.
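One way to do that kind of clean-up, as a sketch only: strip `<script>` blocks, normalise the encoding, and let libxml recover from malformed markup quietly. `cleanHtmlForDom()` is a hypothetical helper name, the regex is a blunt instrument that assumes you do not need the scripts, and the `$charset` argument assumes you already know what the page is really encoded in. The charset conversion needs the mbstring extension.

```php
<?php
// Hypothetical helper: pre-process raw HTML, then load it into a DOMDocument.
function cleanHtmlForDom(string $html, string $charset = 'UTF-8'): DOMDocument
{
    // Inline JavaScript: drop <script> blocks so markup inside JS strings
    // cannot confuse the HTML parser. Fall back to the input on regex failure.
    $html = preg_replace('~<script\b[^>]*>.*?</script>~is', '', $html) ?? $html;

    // Incorrect charset: convert to UTF-8 ourselves, then state it explicitly.
    if (strcasecmp($charset, 'UTF-8') !== 0) {
        $html = mb_convert_encoding($html, 'UTF-8', $charset);
    }
    $html = '<?xml encoding="UTF-8">' . $html;

    // Broken HTML: libxml recovers on its own; we only silence the warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Remove the encoding hint again if it was kept as a top-level node.
    foreach ($dom->childNodes as $node) {
        if ($node->nodeType === XML_PI_NODE) {
            $dom->removeChild($node);
            break;
        }
    }
    return $dom;
}
```

If you do need the contents of a script block, extract it from the raw string before calling this, since it is gone afterwards.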
In your case, once you have the `DOMDocument` and `DOMXPath` objects, you want to call `query("//*[@id=\"cmform\"]")`. The two forward slashes mean "at any depth in the document", `*` matches any element, and `[@id="cmform"]` is an exact match on the `id` attribute.
Note that this will not give you a single element if the document has multiple elements with the same ID: the query returns every match. There shouldn't be any duplicates, per the HTML spec.
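To make the lookup concrete, here is a minimal sketch. `findById()` is a hypothetical helper name, and it assumes the ID you pass in contains no double quotes, since it is pasted straight into the XPath expression. Throwing on duplicate IDs is my choice for the example; you could just as well take the first match.

```php
<?php
// Hypothetical helper: find the single element carrying a given id attribute.
function findById(DOMDocument $dom, string $id): ?DOMElement
{
    $xpath = new DOMXPath($dom);

    // "//*" = any element at any depth; [@id="..."] = exact match on the id attribute.
    $nodes = $xpath->query('//*[@id="' . $id . '"]');

    if ($nodes === false || $nodes->length === 0) {
        return null; // invalid expression or no such element
    }
    if ($nodes->length > 1) {
        // The page breaks the "IDs are unique" rule, so refuse to guess.
        throw new RuntimeException("Found {$nodes->length} elements with id \"$id\"");
    }
    return $nodes->item(0);
}

// Usage, given a DOMDocument in $dom:
// $form = findById($dom, 'cmform');
```

From there, `$form->getElementsByTagName('input')` or a second `query()` call with `$form` as the context node lets you walk the fields of the form.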