I'm trying to parse an RDF file in the old Mozilla extension format to get the version for use in a makefile, so I'm looking for xmllint or similar approaches.

<?xml version="1.0" encoding="UTF-8"?>
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:em="http://www.mozilla.org/2004/em-rdf#">
<Description about="urn:mozilla:install-manifest"
    em:id="some ID" em:version="1.0"
    em:name="somename"
    em:description="description"
    em:creator="author"
    em:iconURL="chrome://extension/skin/icon.png"
    em:unpack="false" em:type="2" />
<em:targetApplication name="Pale Moon">
    <Description
        em:id="{8de7fcbb-c55c-4fbe-bfc5-fc555c87dbc4}" 
        em:minVersion="28.0.0a1"
        em:maxVersion="29.*" />
</em:targetApplication>
</RDF>

How do I extract the em: elements that are nested within a Description tag (id and version mainly)? Or should I not use this altogether and use regex - but again how on the commandline? I've found references to xmllint and xmlstarlet here, but couldn't figure it out.

Edit: I tried the following command line with xmlstarlet, but it doesn't output anything.

xmlstarlet sel -N r=http://www.w3.org/1999/02/22-rdf-syntax-ns# -N em=http://www.mozilla.org/2004/em-rdf# -t -m "/r:RDF/r:Description/@em:version"  install.rdf
Tomalak
Rex
  • Are you looking to use pure bash/dos commands, or would you be interested in a programming language with modules that do this? This [xml-parsing library from python](https://docs.python.org/3/library/xml.etree.elementtree.html) is an easy way to get xml-parsing done. – Xinthral Jun 29 '20 at 12:27
  • *"but again how on the commandline"* you really need to define what output you expect, and what you have tried so far – Tomalak Jun 29 '20 at 12:34
  • Firstly, `em:*` are not elements, but attributes. – Alexander Petrov Jun 29 '20 at 13:22
  • Pure bash preferably, I'm on Linux. Trying to parse it to use in a makefile. I just want to extract the value from an attribute, for example em:version. – Rex Jun 29 '20 at 15:34
  • The duplicate I've linked to shows how to solve this with xmlstarlet, you should be able to figure it out from there. The full XPath for `em:version` would be `/r:RDF/r:Description/@em:version`, where the namespace URI for `r` refers to `http://www.w3.org/1999/02/22-rdf-syntax-ns#` and for `em` to `http://www.mozilla.org/2004/em-rdf#`. – Tomalak Jun 29 '20 at 15:44
  • Wait, duplicate where? – Rex Jun 29 '20 at 15:45
  • Reload the page, link is on top of your question – Tomalak Jun 29 '20 at 15:46
  • The answer gives some more details about "Implicit Declaration of Default Namespace" in xmlstarlet, this info is also applicable to your file here. – Tomalak Jun 29 '20 at 15:50
  • Tried a commandline but it doesn't output anything. Have updated the question. – Rex Jun 29 '20 at 16:31
  • Together with the knowledge that default namespaces are bound to `_` in xmlstarlet, and the fact that xmlstarlet apparently reads the namespace declarations from the XML and makes them available on its own for convenience (I didn't know until now) the command becomes very short: `xmlstarlet sel -t -v "/_:RDF/_:Description/@em:version" install.rdf` outputs `1.0` for me. – Tomalak Jun 29 '20 at 17:07
  • If you want to select more info, extend the `-t` (template), e.g. `-t -v "/some/xpath" -n -v "/some/other/xpath"` would print the values of two XPaths with a newline in between. Take a look at the docs, the tool is very flexible. http://xmlstar.sourceforge.net/doc/UG/ch04s01.html – Tomalak Jun 29 '20 at 17:15
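
For anyone who would rather follow the Python suggestion from the first comment than use xmlstarlet, here is a minimal sketch using only the standard library's `xml.etree.ElementTree`. It assumes the file looks like the sample in the question, with `em:version` and `em:id` as attributes on the first `Description` that is a direct child of `RDF`. The helper name `em_attr` and the prefix map `NS` are made up for illustration; only the two namespace URIs come from the question.

```python
import sys
import xml.etree.ElementTree as ET

# Namespace URIs copied from the install.rdf in the question.
# The prefixes "r" and "em" are local labels chosen for this sketch.
NS = {
    "r": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "em": "http://www.mozilla.org/2004/em-rdf#",
}


def em_attr(root, name):
    """Return the em:<name> attribute of the top-level Description, or None."""
    # Description inherits the default (RDF) namespace, so it must be
    # looked up with the RDF prefix rather than as a bare tag name.
    desc = root.find("r:Description", NS)
    if desc is None:
        return None
    # ElementTree stores namespaced attributes as "{namespace-uri}localname".
    return desc.get("{%s}%s" % (NS["em"], name))


# Print the version only when a file path is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(em_attr(ET.parse(sys.argv[1]).getroot(), "version"))
```

Saved under a hypothetical name such as `get_version.py`, this could be called from a makefile with something like `VERSION := $(shell python3 get_version.py install.rdf)`. The point it illustrates is the same one the xmlstarlet comments make: both the element and the attribute are namespaced, so a lookup that ignores the namespaces matches nothing.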

0 Answers