The data you're looking for isn't really well-formed for automated extraction, but we can work with it. The getElementByVal() function from this answer can be refactored to create getElementsByVal(), which will return an array of all matching document elements, giving us something to search further in.
/**
* Traverse the given XmlElement, and return an array of matches.
* Note: 'class' is stripped during parsing and cannot be used for
* searching, I don't know why.
* <pre>
* Example: getElementsByVal( body, 'input', 'value', 'Go' ); will find
*
* <input type="submit" name="btn" value="Go" id="btn" class="submit buttonGradient" />
* </pre>
*
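* @param {XmlElement} element XML document element to start search at.
* @param {String} elementType Local name of the element to match, e.g. 'input'.
* @param {String} attr Name of the attribute to test, e.g. 'value'.
* @param {String} val Attribute value to match, e.g. 'Go'.
*
* @return {XmlElement[]} All matching elements (in doc order).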
*/
function getElementsByVal( element, elementType, attr, val ) {
  var results = [];
  // If the current element matches, remember it.
  if (element[attr]
      && element[attr] == val
      && element.getName().getLocalName() == elementType) {
    results.push( element );
  }
  // Check element's children
  var elList = element.getElements();
  // Iterate forwards so that matches really are collected in document order.
  for (var i = 0; i < elList.length; i++) {
    // (Recursive) Check each child, in document order.
    results = results.concat(getElementsByVal( elList[i], elementType, attr, val ));
  }
  // Return summary of matches
  return results;
}
To make use of this new helper function, we can create a function getIndicators() that accepts a String parameter containing the point in time we're interested in, for example Mon. Jul 29, 2013 07:40:00. The matching text will be found in an area element that has shape="rect", inside an attribute called onmousemove. Here's our function:
function getIndicators(timeString) {
  var txt = UrlFetchApp.fetch("http://www.barchart.com/chart.php?sym=DXU13&t=BAR&size=M&v=2&g=1&p=I:5&d=L&qb=1&style=technical&template=").getContentText();
  var doc = Xml.parse(txt,true);
  var body = doc.html.body;
  var indicators = "not found";
  // Look for elements matching: < area shape="rect" ... >
  var chartPoints = getElementsByVal(body, 'area', 'shape', 'rect');
  // Search for the chartPoint with tooltip containing the time we care about
  for (var i=0; i<chartPoints.length; i++) {
    if (chartPoints[i].onmousemove.indexOf(timeString) > -1) {
      // found our match
      indicators = chartPoints[i].onmousemove;
    }
  }
  return indicators;
}
As it stands, it will return the entire text value assigned to onmousemove; the exercise of parsing that intelligently is left to you.
Here's a test function to help:
function test_getIndicators() {
  Logger.log( getIndicators("Mon. Jul 29, 2013 07:40:00" ) );
}
When run, here's the log (today, anyway...):
[13-07-29 16:47:47:266 EDT] showOHLCTooltip(event, 'B', '[Mon. Jul 29, 2013 07:40:00]', 'DXU13', '81.8050000', '81.8300000', '81.8000000', '81.8200000')
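As one possible way to take on that parsing exercise, here's a minimal sketch in plain JavaScript. Note that parseTooltip is a hypothetical helper name (it isn't part of Apps Script or the page being scraped), and the open/high/low/close field order is an assumption inferred from the showOHLCTooltip name and the sample log line above, so check it against the chart before relying on it.

```javascript
// Hypothetical helper: split the onmousemove text into named fields.
// Assumes the shape seen in the log above:
//   showOHLCTooltip(event, 'B', '[time]', 'symbol', 'o', 'h', 'l', 'c')
function parseTooltip(tooltipText) {
  // Collect every single-quoted argument, in order.
  var args = [];
  var re = /'([^']*)'/g;
  var m;
  while ((m = re.exec(tooltipText)) !== null) {
    args.push(m[1]);
  }
  // Fewer than 7 quoted arguments means this isn't the text we expected.
  if (args.length < 7) {
    return null;
  }
  return {
    // Strip the surrounding square brackets from the timestamp.
    time: args[1].replace(/^\[|\]$/g, ''),
    symbol: args[2],
    // ASSUMPTION: order is open, high, low, close (from the function name).
    open: parseFloat(args[3]),
    high: parseFloat(args[4]),
    low: parseFloat(args[5]),
    close: parseFloat(args[6])
  };
}
```

With that in place, something like parseTooltip(getIndicators("Mon. Jul 29, 2013 07:40:00")) should give an object whose fields can be logged individually. It returns null when handed the "not found" placeholder, so the caller can tell a miss from a match.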