I need to process XSLT using Python. Currently I'm using lxml, which only supports XSLT 1.0. Now I need to process XSLT 2.0. Is there any way to use the Saxon XSLT processor with Python?
7 Answers
There are two possible approaches:
set up an HTTP service that accepts transformation requests and implements them by invoking Saxon from Java; you can then send the transformation requests from Python over HTTP
use the Saxon/C product, currently available on prerelease: details here: http://www.saxonica.com/saxon-c/index.xml
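For the first approach, the Python side could look roughly like the sketch below. It is only an illustration: the endpoint URL and the JSON field names (`xml`, `xslt`) are hypothetical and depend entirely on how the Java service is written.

```python
import json
import urllib.request

# Hypothetical endpoint: the answer does not name one.
SERVICE_URL = "http://localhost:8080/transform"


def build_transform_request(xml_text, xslt_text, url=SERVICE_URL):
    """Build a POST request carrying the source document and stylesheet as JSON."""
    # The field names "xml" and "xslt" are assumptions about the service.
    payload = json.dumps({"xml": xml_text, "xslt": xslt_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def transform_over_http(xml_text, xslt_text, url=SERVICE_URL, timeout=30):
    """Send the request and return the response body as text."""
    request = build_transform_request(xml_text, xslt_text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Keeping the request construction separate from the network call makes it possible to check the payload without a running service.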

@Maliqf, which approach did you end up taking, and how was your experience with it? – Vijay Kumar Dec 16 '15 at 17:41
I wrap Saxon/C in a thin Boost-Python wrapper. It's not difficult to do, provided you know a bit of C/C++: it's just a bit of boilerplate on top of the C++ examples given on Saxon's website. You can use the supplied PHP API as a guide on how to structure your Python API. I did it for exactly the reasons stated: no XSLT 3 support native to Python. It works well for me. Specifically, it's fast, unlike forking a child Saxon process or making HTTP requests. – Phil Jul 12 '18 at 12:29
Saxon/C release 1.2.0 is now out with XSLT 3.0 support for Python 3. See details:

By now, this should be promoted to the correct answer. Also cf. https://stackoverflow.com/questions/59059768/making-saxon-c-available-in-python for a step-by-step description. – Chiarcos Mar 14 '22 at 12:48
At the moment there is not, but you could use the subprocess module to use the Saxon processor:
import subprocess
subprocess.call(["saxon", "-o:output.xml", "-s:file.xml", "file.xslt"])
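A slightly more defensive variant is sketched below. It assumes, as the call above does, that a `saxon` launcher is on the PATH; the helper names `saxon_command` and `run_saxon` are made up for illustration. `subprocess.run` with `check=True` raises an exception when the process exits with a non-zero status, so a failed transformation does not pass silently.

```python
import subprocess


def saxon_command(source, stylesheet, output, executable="saxon"):
    """Build the same argument list as the call above."""
    return [executable, f"-o:{output}", f"-s:{source}", stylesheet]


def run_saxon(source, stylesheet, output, executable="saxon"):
    """Run Saxon; raises subprocess.CalledProcessError on a non-zero exit status."""
    result = subprocess.run(
        saxon_command(source, stylesheet, output, executable),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the captured output to str
        check=True,
    )
    # Return whatever the process wrote to stderr, for logging.
    return result.stderr
```

Usage would be `run_saxon("file.xml", "file.xslt", "output.xml")`.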

On January 13, 2023, Saxonica released their own maintained pip package for Saxon 12:
Now all we need is:
pip install saxonche

If you're using Windows:
Download the zip file Saxon-HE 9.9 for Java from http://saxon.sourceforge.net/#F9.9HE and unzip the file to C:\saxon
Use this Python code:
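import os
import subprocess

def file_path(relative_path):
    """Resolve a path relative to the folder containing this script."""
    folder = os.path.dirname(os.path.abspath(__file__))
    path_parts = relative_path.split("/")
    return os.path.join(folder, *path_parts)

def transform(xml_file, xsl_file, output_file):
    """All args take paths relative to this Python script."""
    source = file_path(xml_file)  # renamed from `input` to avoid shadowing the builtin
    output = file_path(output_file)
    xslt = file_path(xsl_file)
    # A raw string keeps the backslashes in the Windows path literal, and an
    # argument list avoids quoting problems when a path contains spaces.
    subprocess.call([
        "java", "-cp", r"C:\saxon\saxon9he.jar", "net.sf.saxon.Transform",
        "-t", f"-s:{source}", f"-xsl:{xslt}", f"-o:{output}",
    ])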

This is in addition to the above answers suggesting subprocess and saxonche.
The example code in saxonche's PyPI repository is slightly flawed in that essential indentation is missing.
Also, I know it's just an example, but it would instantiate a new_xslt30_processor() for each and every XML file you need to transform. That wouldn't be very efficient.
My use case is that I periodically get a bunch of XML files (MARC21) that I need to transform with one and the same XSLT stylesheet (XSLT 2.0). So assume that the stylesheet 'o2a.xml' produces the desired output when I run
transform -s:my.xml -xsl:o2a.xml -o:my_output.xml
So I wrote this:
from saxonche import PySaxonProcessor
from pathlib import Path

class Xslt_proc():
    # Class attributes: the processor is created and the stylesheet is
    # compiled only once, when the class is defined.
    proc = PySaxonProcessor(license=False)
    nuproc = proc.new_xslt30_processor()
    xform = nuproc.compile_stylesheet(stylesheet_file='o2a.xsl')

def transform(processor, infile, sfx):
    # The output file is named <input stem>_<sfx>.xml
    outfname = f'{Path(infile).stem}_{sfx}.xml'
    doc = processor.proc.parse_xml(xml_file_name=infile)
    out = processor.xform.transform_to_string(xdm_node=doc)
    with open(outfname, 'w') as f:
        f.write(out)

def main():
    f_xml = 'some_xml_file.xml'
    P = Xslt_proc()
    transform(P, f_xml, '_done')

if __name__ == "__main__":
    main()
I was curious which method would be faster, subprocess or the code above.
So I ran 20 iterations on 5 input files, first using a subprocess call to transform.exe, and then again, 20 iterations on the same 5 input files, with my own module, like this:
from pathlib import Path
import saxonche_transform as st

flist = [f.name for f in Path('.').glob('*.xml')]
P = st.Xslt_proc()
for i in range(20):
    for f in flist:
        st.transform(P, f, '_python')
The latter was 100 times faster: 2.6 seconds against 258 seconds for the subprocess test.
So thank you, Saxonica.
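For anyone who wants to repeat the comparison, a small timing helper along these lines would do. This is only a sketch: `time_runs` is a made-up name, not part of saxonche, and it measures wall-clock time with `time.perf_counter`.

```python
import time


def time_runs(func, inputs, iterations=20):
    """Call func(item) for every item in inputs, `iterations` times over.

    Returns the total elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    for _ in range(iterations):
        for item in inputs:
            func(item)
    return time.perf_counter() - start
```

With the names from the test loop above, the in-process variant would be timed as `time_runs(lambda f: st.transform(P, f, '_python'), flist)`, and the subprocess variant by passing a function that makes the subprocess call instead.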
