Questions tagged [xidel]

Xidel is a command line tool to download and extract data from HTML/XML pages as well as JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern templates. It can also edit or create new XML/HTML/JSON documents.

Xidel supports:

Extract expressions

  • CSS 3 Selectors: to extract simple elements
  • XPath 3.0: to extract values and calculate things with them
  • XQuery 3.0: to create new documents from the extracted values
  • JSONiq: to work with JSON APIs
  • Templates: to extract several expressions in an easy way using an annotated version of the page for pattern-matching
  • XPath 2.0/XQuery 1.0: compatibility mode for the old XPath/XQuery version

Following

  • HTTP Codes: Redirections like 30x are automatically followed, while keeping things like cookies
  • Links: It can follow all links on a page as well as some extracted values
  • Forms: It can fill in arbitrary data and submit the form

Output formats

  • Ad hoc: just prints the data in a human-readable format
  • XML: encodes the data as XML
  • HTML: encodes the data as HTML
  • JSON: encodes the data as JSON
  • bash/cmd: exports the data as shell variables

Connections

  • HTTP / HTTPS, as well as local files and stdin

Systems

  • Windows (using wininet), Linux (using synapse+openssl), Mac (synapse)
81 questions
4
votes
0 answers

Xidel utility alternative for ARM / Raspberry Pi?

Does anyone know of a utility similar to Xidel that works on ARM processors, specifically on a Raspberry Pi 2 Model B? I created a few Bash scripts on my x86_64 laptop that I am going to put on an always-on RPi when it arrives, but I just…
3
votes
1 answer

How to extract exact values from a json file with xidel?

Excuse my English, I am not a native speaker. I'm new to this, so I don't know much. I am trying to extract some values from a JSON file with xidel using the following command in Windows cmd, but it's not working: xidel MyFile.json -e…
andy784
3
votes
1 answer

Why is my XPath with regex failing to match?

I would like to use Xidel to select a tag with class="body" if it contains a date in the format YYYY.M(M).D(D), in order to find and extract one specific string which has 8 characters and can contain characters and digits. Sample input HTML:
Adrian
3
votes
1 answer

How to join two extracted values with xidel?

I use the following to extract two values using xidel -e: '//input[@name="qid"]/@value[1]' "//span[@id='trueFinalResultCount']". But I'd like to put the two results into TSV format: result1<TAB>result2. Could anybody show me how to combine the…
user1424739
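The TSV layout this question asks for can be illustrated outside xidel. The following is a minimal Python sketch, not a xidel command; the function name `to_tsv_row` and the sample values are hypothetical stand-ins for the two extracted results.

```python
def to_tsv_row(*values: str) -> str:
    """Join already-extracted values into one tab-separated line."""
    # Strip surrounding whitespace so stray newlines do not break the row.
    return "\t".join(value.strip() for value in values)


# Hypothetical stand-ins for the two values the XPath expressions would return.
qid = "12345\n"
result_count = " 678 "
print(to_tsv_row(qid, result_count))
```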
3
votes
2 answers

Creating an alias for xpath expression in xidel with regex and bash

If you have already used Xidel, you will often have needed to locate nodes that have a certain class. To make this easier, I want to create a has-class("class") function that serves as an alias for the expression: contains(concat(" ",…
Rodrigo Vieira
3
votes
2 answers

Can we use Xidel for extracting data across a site into search files?

Background: We are aggregating content from some websites (with permission) for use in supplementary search functions for another application. An example is the news section of https://centenary.bahai.us. We thought to use xidel for this purpose,…
David Hunt
3
votes
2 answers

Xidel json xpath - how get value of multiple elements

I need to get the values of multiple elements from JSON data using Xidel. Single-element queries like xidel - -e 'jn:members(json($raw))("client_name")' and xidel - -e 'jn:members(json($raw))("amount")' work fine, but after googling for a long time I am unable to find…
user2956477
2
votes
3 answers

Is there a way to split a string in fixed width chunks in XPath?

Using xidel, I'm extracting //Assertion//Signature//KeyInfo//X509Certificate/text() from a SAMLResponse; this is an X509 certificate as a long base64 string. I want to split this string into 64-character blocks. I tried with tokenize() and replace()…
RubenLaguna
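To make the goal of this question concrete, here is a minimal Python sketch of fixed-width splitting. It is not an XPath or xidel solution; the function name `split_fixed_width` and the sample string are hypothetical, and the sample only stands in for the base64 certificate text.

```python
def split_fixed_width(text: str, width: int = 64) -> list[str]:
    """Split text into consecutive chunks of at most `width` characters."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Step through the string in strides of `width`; the last chunk may be shorter.
    return [text[i:i + width] for i in range(0, len(text), width)]


# Hypothetical stand-in for the long base64 certificate string.
sample = "A" * 130
for block in split_fixed_width(sample):
    print(block)
```

The last block is simply whatever remains, so a 130-character input yields two full 64-character blocks and one 2-character block.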
2
votes
1 answer

xidel: is it possible to retrieve specific nested values from JSON object?

Consider this JSON, from this question: { "apiVersion": "apps/v1", "kind": "Deployment", "metadata": { "annotations": { "deployment.kubernetes.io/revision": "1" }, "creationTimestamp":…
Gilles Quénot
2
votes
4 answers

Bash script to download latest release from GitHub

Looking for a simple way to download a .zip from the latest GitHub release. There are other similar questions, but I haven't been able to get them to work. :( Trying to pull the latest release from https://github.com/CTCaer/hekate Currently I've…
Fraxalotl
2
votes
1 answer

How to read correctly this JSON file with Xidel?

Excuse my English, I am not a native speaker. I have a JSON file "VideoJson.json" which contains the following: VideoJSONLoaded({"video_type": "0","image_id": "0","profile": false,"published_urls": [{"embed_url":…
andy784
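The file in this question appears to be JSON wrapped in a callback call (often called JSONP), which a plain JSON parser rejects until the wrapper is removed. The following is a minimal Python sketch of that unwrapping step, not a xidel command; the helper name `unwrap_jsonp` is hypothetical, and the sample payload is a shortened, made-up version of the truncated excerpt.

```python
import json
import re


def unwrap_jsonp(payload: str):
    """Strip a callback wrapper such as name({...}) and parse the JSON inside."""
    # Match: callback name, opening parenthesis, body, closing parenthesis,
    # and an optional trailing semicolon.
    match = re.fullmatch(r"\s*[\w$.]+\s*\((.*)\)\s*;?\s*", payload, flags=re.DOTALL)
    if match is None:
        raise ValueError("payload is not wrapped in a callback call")
    return json.loads(match.group(1))


# Hypothetical, shortened payload modeled on the excerpt; the real file is longer.
sample = 'VideoJSONLoaded({"video_type": "0", "image_id": "0", "profile": false})'
data = unwrap_jsonp(sample)
print(data["video_type"])
```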
2
votes
2 answers

Xidel extract number/float

I would like to extract a number/float value from this code using Xidel:

304.00

Adrian
2
votes
1 answer

Xidel get json from HTML tag attribute

I am trying to extract an image URL from a div, where the link to the file is stored as a JSON object in the data-settings attribute:
Adrian
2
votes
2 answers

Retrieve value from object in Javascript in XPATH

I need to extract information from HTML files. For most of them, I just need to match a particular DOM element's content or attribute, so I use XPATH expressions like //a[@class="targeturl"]/@href and the command line tool xidel. In a different…
dmcontador
2
votes
1 answer

Convert Output of Xidel Pattern Match into Json Objects

Curious whether I can use a Xidel pattern from a file to convert an HTML list into an array of JSON objects. Given this example HTML:
Mr. Curious