I'm trying to find all anchor links that point to images (.png, .bmp, .jpg) or executables (.exe) using lxml. From this similar thread, the accepted answer suggests doing something like this:
png = tree.xpath("//div/ul/li//a[ends-with(@href, '.png')]")
bmp = tree.xpath("//div/ul/li//a[ends-with(@href, '.bmp')]")
jpg = tree.xpath("//div/ul/li//a[ends-with(@href, '.jpg')]")
exe = tree.xpath("//div/ul/li//a[ends-with(@href, '.exe')]")
However, I keep getting this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "lxml.etree.pyx", line 2095, in lxml.etree._ElementTree.xpath (src/lxml/lxml.etree.c:53597)
File "xpath.pxi", line 373, in lxml.etree.XPathDocumentEvaluator.__call__ (src/lxml/lxml.etree.c:134052)
File "xpath.pxi", line 241, in lxml.etree._XPathEvaluatorBase._handle_result (src/lxml/lxml.etree.c:132625)
File "xpath.pxi", line 226, in lxml.etree._XPathEvaluatorBase._raise_eval_error (src/lxml/lxml.etree.c:132453)
lxml.etree.XPathEvalError: Unregistered function
I'm running lxml 3.2.4, installed through pip.
Also, instead of writing the XPath four times, once per file extension, is there a way to specify all four file extensions in a single XPath expression?
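One possible direction, as a rough sketch rather than a confirmed fix: ends-with() is an XPath 2.0 function, while lxml (through libxml2) implements XPath 1.0, which would explain the "Unregistered function" error. In XPath 1.0 the same check can be written with substring() and string-length(), and several checks can be joined with "or" in one predicate. The helper name ends_with_any below is hypothetical, the path prefix is the one used above, and the sketch assumes non-empty suffixes that contain no single quotes. The comparison is case-sensitive, so an href ending in ".PNG" would not match.

```python
def ends_with_any(attr, suffixes):
    """Build an XPath 1.0 predicate that is true when `attr` ends with any suffix.

    Hypothetical helper: assumes each suffix is non-empty and has no single quotes.
    """
    tests = []
    for suffix in suffixes:
        # XPath 1.0 has no ends-with(), so compare the trailing substring instead.
        # substring() is 1-indexed: for a 4-character suffix the start position
        # is string-length(attr) - 3.
        tests.append(
            "substring({a}, string-length({a}) - {n}) = '{s}'".format(
                a=attr, n=len(suffix) - 1, s=suffix
            )
        )
    return " or ".join(tests)


EXTENSIONS = [".png", ".bmp", ".jpg", ".exe"]

# One expression covering all four extensions.
expr = "//div/ul/li//a[{}]".format(ends_with_any("@href", EXTENSIONS))
```

The resulting string would then be passed to the existing tree in a single call, for example links = tree.xpath(expr), replacing the four separate queries.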