Xidel is a command-line tool to download and extract data from HTML/XML pages as well as JSON APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern templates. It can also edit existing XML/HTML/JSON documents or create new ones.
Xidel supports:
Extract expressions
- CSS 3 Selectors: to extract simple elements
- XPath 3.0: to extract values and calculate things with them
- XQuery 3.0: to create new documents from the extracted values
- JSONiq: to work with JSON APIs
- Templates: to extract several values at once by pattern-matching the page against an annotated copy of it
- XPath 2.0/XQuery 1.0: compatibility mode for the older XPath/XQuery versions
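
The difference between selecting elements and calculating with extracted values can be sketched outside of Xidel. The following Python example is only an illustration under stated assumptions: it uses the standard library's `xml.etree.ElementTree`, which implements a small XPath subset rather than XPath 3.0, and the sample page and the function names `extract_titles` and `total_price` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; any well-formed XML/XHTML would do.
PAGE = """
<html>
  <body>
    <ul id="books">
      <li class="book"><a href="/b/1">Dune</a><span class="price">9.50</span></li>
      <li class="book"><a href="/b/2">Emma</a><span class="price">4.25</span></li>
    </ul>
  </body>
</html>
"""

def extract_titles(root):
    """Select simple elements, roughly what a selector like `li.book a` picks."""
    return [a.text for a in root.findall(".//li[@class='book']/a")]

def total_price(root):
    """Extract values and calculate with them (here: sum the prices)."""
    return sum(float(s.text) for s in root.findall(".//span[@class='price']"))

root = ET.fromstring(PAGE)
titles = extract_titles(root)
total = total_price(root)
print(titles, total)
```

In a full XPath or XQuery engine the selection and the calculation would be a single expression; here the calculation has to be done in Python because ElementTree only supports selection.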
Following
- HTTP Codes: Redirects (30x) are followed automatically, and state such as cookies is kept
- Links: It can follow all links on a page as well as URLs taken from extracted values
- Forms: It can fill in arbitrary data and submit the form
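
What "fill in arbitrary data and submit the form" involves can be sketched as follows. This is a minimal Python illustration, not Xidel's implementation: the class `FormFields`, the function `build_submission`, the sample form and the `example.org` URL are all hypothetical. It collects the default values of a form's inputs, overrides some of them, and builds the target URL and the URL-encoded request body without sending anything.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

class FormFields(HTMLParser):
    """Collect the action URL and named <input> defaults of the first form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = a.get("action") or ""
        elif tag == "input" and self._in_form and "name" in a:
            # Inputs without a value attribute default to the empty string.
            self.fields[a["name"]] = a.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True

def build_submission(html, base_url, overrides):
    """Return (target URL, encoded body) for the first form in `html`."""
    parser = FormFields()
    parser.feed(html)
    data = {**parser.fields, **overrides}  # user data wins over defaults
    return urljoin(base_url, parser.action or ""), urlencode(data)

# Hypothetical login form with one hidden field.
PAGE = """
<form action="/login" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="user">
  <input type="password" name="pass">
</form>
"""

url, body = build_submission(
    PAGE, "http://example.org/start", {"user": "alice", "pass": "s3 cret"}
)
print(url)
print(body)
```

Hidden fields such as `token` are carried over unchanged, which is why a form is usually filled in from the downloaded page rather than constructed from scratch.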
Output formats
- Adhoc: prints the data in a human-readable format
- XML: encodes the data as XML
- HTML: encodes the data as HTML
- JSON: encodes the data as JSON
- bash/cmd: exports the data as shell variables
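
To show why the output format matters, the sketch below serializes the same extracted variables in three of the ways listed above. It is a Python illustration only and does not reproduce Xidel's actual output layout: the helper names `as_json`, `as_xml` and `as_shell`, the `result` wrapper element and the sample data are assumptions.

```python
import json
import shlex
import xml.etree.ElementTree as ET

def as_json(variables):
    """Encode the variables as a JSON object."""
    return json.dumps(variables)

def as_xml(variables):
    """Encode each variable as a child element; special characters are escaped."""
    root = ET.Element("result")  # hypothetical wrapper element name
    for name, value in variables.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")

def as_shell(variables):
    """Emit POSIX shell assignments, quoting values so they are safe to eval."""
    return "\n".join(
        f"{name}={shlex.quote(value)}" for name, value in variables.items()
    )

# Hypothetical extracted data containing characters that need escaping.
data = {"title": "Tom & Jerry", "author": "O'Neil"}

print(as_json(data))
print(as_xml(data))
print(as_shell(data))
```

Each format needs its own escaping (`&amp;` in XML, quoting in the shell), which is the main reason to let the tool do the encoding instead of post-processing plain text.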
Connections
- HTTP / HTTPS, as well as local files and stdin
Systems
- Windows (using WinINet), Linux (using Synapse + OpenSSL), Mac (using Synapse)