tl;dr;
There is much greater support for CSS selectors and for Element.querySelector (allowing for greater flexibility in chaining querySelector(All) calls). This enormously enhances the expressivity of the MSHTML class, in terms of CSS selectors, and brings it on par with Selenium Basic.
Motivation:
I have been wanting to write a list of supported selectors for some time, due to the lack of documentation on this in relation to VBA, and the trial-and-error nature of learning what does and doesn't work. This latest change has spurred me to do so, and to include those libraries which currently support the use of CSS selectors.
CAVEATS:
- This is not exhaustive, though it is fairly comprehensive.
- Should you find any errors, particularly with respect to Selenium Basic, which I had to write from memory, please notify me and I will edit accordingly.
- The recent changes, shown as shaded cells in the summary table (JSFiddle) and marked with ✔* in the simplified table below, are as they pertain to my set-up at this point in time. Your mileage may vary, e.g. CSS selectors were not supported at all before IE8.
Before and After:
Traditionally, the expressivity of CSS selectors within VBA was as follows, with respect to the libraries supporting them:

Selenium implemented, by far, the most CSS selectors.
Current state:
I believe the current state of implemented selectors to be as follows (sorry for the image quality, even when you click to enlarge the table; please see the JSFiddle for the clearest table view):

I include this as a simplified HTML insert as well, so you can click on hyperlinks. Please click the Run code snippet below the code insert, then the Full page link. Apologies, the table is large and I haven't even covered all conceivable selectors - only the main ones I consider likely to be frequently of use. Inserting a fancy table threw me over the body character limit so here we are. For a fancy table please see this JSFiddle - the newly supported are shaded.
<!DOCTYPE html>
<html>
<head>
<title>VBA: Valid CSS Selectors 2021-05-30</title>
</head>
<body>
<h1>VBA: Valid CSS Selectors 2021-05-30</h1>
<table>
<tr>
<td colspan="2">
<a href="https://drafts.csswg.org/selectors-3/">Selectors Level 3 Specification</a>
</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Pattern</td>
<td>Represents</td>
<td>Description</td>
<td>Level</td>
<td>Microsoft HTML Object Library (MSHTML)</td>
<td>Microsoft Internet Explorer Controls (SHDocVw)</td>
<td>Selenium Type Library (Selenium)</td>
<td>Remarks</td>
</tr>
<tr>
<td>*</td>
<td>any element</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#universal-selector">Universal selector</a>
</td>
<td>2</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E</td>
<td>an element of type E</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#type-selectors">Type selector</a>
</td>
<td>1</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E[foo]</td>
<td>an E element with a "foo" attribute</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
</td>
<td>2</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E[foo="bar"]</td>
<td>an E element whose "foo" attribute value is exactly equal to "bar"</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
</td>
<td>2</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E[foo~="bar"]</td>
<td>an E element whose "foo" attribute value is a list of whitespace-separated values, one of which is exactly equal to "bar"</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
</td>
<td>2</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E[foo^="bar"]</td>
<td>an E element whose "foo" attribute value begins exactly with the string "bar"</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
</td>
<td>3</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E[foo$="bar"]</td>
<td>an E element whose "foo" attribute value ends exactly with the string "bar"</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
</td>
<td>3</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E[foo*="bar"]</td>
<td>an E element whose "foo" attribute value contains the substring "bar"</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
</td>
<td>3</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E[foo|="en"]</td>
<td>an E element whose "foo" attribute has a hyphen-separated list of values beginning (from the left) with "en"</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
</td>
<td>2</td>
<td>x</td>
<td>x</td>
<td>x</td>
<td> </td>
</tr>
<tr>
<td>E[attr operator value i]</td>
<td>value compared case-insensitively (ASCII range).</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
</td>
<td>4</td>
<td>x</td>
<td>x</td>
<td>?</td>
<td>
<a href="https://www.w3.org/TR/selectors-4/#attribute-case">i identifier</a>
</td>
</tr>
<tr>
<td>E[attr operator value s]</td>
<td>value compared case-sensitively (ASCII range).</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
</td>
<td>4</td>
<td>x</td>
<td>x</td>
<td>x</td>
<td>
<a href="https://www.w3.org/TR/selectors-4/#attribute-case">s identifier</a>
</td>
</tr>
<tr>
<td>E:root</td>
<td>an E element, root of the document</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td>HTML node only</td>
</tr>
<tr>
<td>E:nth-child(n)</td>
<td>an E element, the n-th child of its parent</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td>nth-child(odd) and (even) as well as nth-child(range) also supported</td>
</tr>
<tr>
<td>E:nth-last-child(n)</td>
<td>an E element, the n-th child of its parent, counting from the last one</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:nth-of-type(n)</td>
<td>an E element, the n-th sibling of its type</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:nth-last-of-type(n)</td>
<td>an E element, the n-th sibling of its type, counting from the last one</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:first-child</td>
<td>an E element, first child of its parent</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>2</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:last-child</td>
<td>an E element, last child of its parent</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:first-of-type</td>
<td>an E element, first sibling of its type</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:last-of-type</td>
<td>an E element, last sibling of its type</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:only-child</td>
<td>an E element, only child of its parent</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:only-of-type</td>
<td>an E element, only sibling of its type</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:empty</td>
<td>an E element that has no children (including text nodes)</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:link</td>
<td rowspan="2">an E element being the source anchor of a hyperlink of which the target is not yet visited (:link) or already visited (:visited)</td>
<td rowspan="2">
<a href="https://drafts.csswg.org/selectors-3/#link">The link pseudo-classes</a>
</td>
<td>1</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:visited</td>
<td>1</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E:not(s)</td>
<td>an E element that does not match simple selector s</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#negation">Negation pseudo-class</a>
</td>
<td>3</td>
<td>✔*</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E F</td>
<td>an F element descendant of an E element</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#descendant-combinators">Descendant combinator</a>
</td>
<td>1</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E > F</td>
<td>an F element child of an E element</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#child-combinators">Child combinator</a>
</td>
<td>2</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E + F</td>
<td>an F element immediately preceded by an E element</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#adjacent-sibling-combinators">Next-sibling combinator</a>
</td>
<td>2</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>E ~ F</td>
<td>an F element preceded by an E element</td>
<td>
<a href="https://drafts.csswg.org/selectors-3/#general-sibling-combinators">Subsequent-sibling combinator</a>
</td>
<td>3</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td>foo, bar</td>
<td>foo, bar will match both &lt;foo&gt; and &lt;bar&gt; elements.</td>
<td>
<a href="https://developer.mozilla.org/en-US/docs/Web/CSS/Selector_list">Selector list</a>
</td>
<td>1</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>element.querySelector</td>
<td>Expanded element.querySelector</td>
<td>
<a href="https://developer.mozilla.org/en-US/docs/Web/API/Element/querySelector">Element.querySelector</a>
</td>
<td>API</td>
<td>✔</td>
<td>✔</td>
<td>✔</td>
<td>Can now chain querySelector(All) calls on wider base node range</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Lib info:</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Microsoft HTML Object Library (MSHTML)</td>
<td>MS Internet Explorer Controls (SHDocVw)</td>
<td>Selenium Type Library (Chromedriver)</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Lib</td>
<td>mshtml.dll</td>
<td>ieframe.dll</td>
<td>selenium.dll</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>File Version</td>
<td>11.00.19041.985</td>
<td>11.0.19041.964</td>
<td>2.0.9.0</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Date</td>
<td>2021-05-12</td>
<td>2021-05-12</td>
<td>2016-03-02</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</table>
</body>
</html>
12 newly supported pseudo-classes and an expanded Element.querySelector:
If you run the above snippet and view it full page, you will see there are now at least 12 newly supported pseudo-classes, as well as mention of the expanded Element.querySelector. Bam, kapow, ker-sploosh, shut the proverbial front door ... welcome to VBA CSS Canaan, Scraper's Shangri-la, Nerd Nirvana!
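As a rough illustration of two of the selectors marked as newly supported for MSHTML in the table, here is a minimal sketch. It assumes a reference to Microsoft HTML Object Library and an mshtml.dll version comparable to the one listed under Lib info; the markup, class name, and procedure name are made up for illustration, and results may differ on other set-ups.

```vba
Option Explicit

' Hypothetical demo; requires a reference to
' Microsoft HTML Object Library (VBE > Tools > References).
Public Sub NewPseudoClassSketch()
    Dim html As MSHTML.HTMLDocument
    Dim nodes As Object
    Dim secondItem As Object
    Dim i As Long

    Set html = New MSHTML.HTMLDocument
    ' Made-up markup purely for illustration.
    html.body.innerHTML = "<ul><li class=""skip"">one</li>" & _
                          "<li>two</li><li>three</li></ul>"

    ' :not(s) is one of the selectors marked as newly supported.
    Set nodes = html.querySelectorAll("li:not(.skip)")
    For i = 0 To nodes.Length - 1
        Debug.Print nodes.Item(i).innerText
    Next i

    ' :nth-of-type(n) is another; guard against no match.
    Set secondItem = html.querySelector("li:nth-of-type(2)")
    If Not secondItem Is Nothing Then Debug.Print secondItem.innerText
End Sub
```

If a selector is not supported on a given set-up, the call may raise a run-time error rather than return an empty result, so it can be worth wrapping such calls in error handling while testing.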
I think there may also have been interesting updates to ieframe.dll; the focus here is on recent mshtml.dll changes. You may wish to review the IE support under the Lifecycle announcements here and here, or search for Lifecycle FAQ - Internet Explorer and Microsoft Edge.
As the benefit of the expanded Element.querySelector() was not covered in the last Q&A, I will briefly mention it here. By expanded, I mean an increased number of elements which you can call querySelector on, such that you can chain .querySelector(), i.e. .querySelector(..).querySelector(..) and .querySelector(..).querySelectorAll(..).
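The chaining described above might look something like the following minimal sketch. It assumes a reference to Microsoft HTML Object Library and a set-up where the expanded Element.querySelector is available; the markup, the ids, and the procedure name are hypothetical.

```vba
Option Explicit

' Hypothetical demo; requires a reference to
' Microsoft HTML Object Library (VBE > Tools > References).
Public Sub ChainedQuerySelectorSketch()
    Dim html As MSHTML.HTMLDocument
    Dim items As Object
    Dim i As Long

    Set html = New MSHTML.HTMLDocument
    ' Made-up markup purely for illustration.
    html.body.innerHTML = "<ul id=""menu""><li>one</li><li>two</li></ul>" & _
                          "<ul id=""other""><li>three</li></ul>"

    ' Chain querySelectorAll onto the node returned by querySelector,
    ' so only the li elements inside #menu are matched.
    Set items = html.querySelector("#menu").querySelectorAll("li")

    ' Walk the returned collection by index.
    For i = 0 To items.Length - 1
        Debug.Print items.Item(i).innerText
    Next i
End Sub
```

On set-ups without the expanded support, the chained call is the part most likely to fail, so that line is the one to check first.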
Previously, this was largely not possible, as exemplified by this question. Typically, the workaround was to chain traditional methods onto the returned node, e.g. html.querySelector("body").getElementsByTagName("li"); this led to unsightly chaining and to paths to target elements that were hard to follow as well as limited. Better, IMHO, was the idea of a surrogate MSHTML.HTMLDocument variable, which would carry the innerHTML of the current node returned by querySelector, and thus allow you to call querySelector(All) again, thereby gaining access to much faster matching, clearer syntax and greater versatility. Numerous examples of that approach here.
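For readers on older set-ups, the surrogate-document workaround can be sketched roughly as below. This is an illustration under assumptions, not the exact code from the linked examples: it assumes a reference to Microsoft HTML Object Library, and the markup, the #wrap id, and the variable and procedure names are hypothetical.

```vba
Option Explicit

' Hypothetical demo; requires a reference to
' Microsoft HTML Object Library (VBE > Tools > References).
Public Sub SurrogateDocumentSketch()
    Dim html As MSHTML.HTMLDocument
    Dim surrogate As MSHTML.HTMLDocument
    Dim items As Object
    Dim i As Long

    Set html = New MSHTML.HTMLDocument
    Set surrogate = New MSHTML.HTMLDocument

    ' Made-up markup purely for illustration.
    html.body.innerHTML = "<div id=""wrap""><ul><li>one</li><li>two</li></ul></div>" & _
                          "<ul><li>three</li></ul>"

    ' Copy the markup of the node of interest into the surrogate document...
    surrogate.body.innerHTML = html.querySelector("#wrap").innerHTML

    ' ...so that querySelectorAll can be called again at document level.
    Set items = surrogate.querySelectorAll("li")

    For i = 0 To items.Length - 1
        Debug.Print items.Item(i).innerText
    Next i
End Sub
```

The trade-off is an extra parse of the copied markup each time, in exchange for being able to use CSS selectors at every step.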
End Notes:
This is a document under revision. All feedback on improvements welcomed.
Thanks:
Finally, a big thanks to @SIM for running a test script of mine to examine this on a different set-up.