83

Good day! Need to convert xml using xslt in Python. I have a sample code in php.

How to implement this in Python or where to find something similar? Thank you!

$xmlFileName = dirname(__FILE__)."example.fb2";
$xml = new DOMDocument();
$xml->load($xmlFileName);

$xslFileName = dirname(__FILE__)."example.xsl";
$xsl = new DOMDocument;
$xsl->load($xslFileName);

// Configure the transformer
$proc = new XSLTProcessor();
$proc->importStyleSheet($xsl); // attach the xsl rules
echo $proc->transformToXML($xml);
aphex
  • 1,147
  • 1
  • 12
  • 17

3 Answers3

137

Using lxml,

import lxml.etree as ET

dom = ET.parse(xml_filename)
xslt = ET.parse(xsl_filename)
transform = ET.XSLT(xslt)
newdom = transform(dom)
print(ET.tostring(newdom, pretty_print=True))
R Sahu
  • 204,454
  • 14
  • 159
  • 270
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • 6
    Hello, I have one more question here. If the xml file is large, the efficiency of "newdom = transform(dom)" is very bad. I tried to parse a large xml file (>100MB), it will cost a long time (>4 hours) to transform. Do you have any good advice about transforming xml file with xslt using python lxml? – KevinLeng Mar 11 '14 at 09:01
  • 2
    Under the hood, lxml is using [libxslt](http://xmlsoft.org/XSLT/) to transform the XML. I don't know of anything you can do to speed this up. Maybe if you post your XSLT in a new question, someone will be able to suggest an improvement. – unutbu Mar 11 '14 at 09:28
  • Thanks. I have tried using write with no luck. How do you write contents to file? – programiss Sep 29 '15 at 10:05
  • @programiss What you mean with no luck? Etree.write(file, pretty_print) is the method to use. Maybe you passed wrong argument? That's common mistake. It's supposed to be writable object (eg. file or sys.stdout, etc), so not just filename string. – Stabby Oct 14 '15 at 11:16
  • `print(ET.tostring(newdom, pretty_print=True))` for some reason only showed me part of the expected output (compared with xsltproc). The solution was to use `print(str(newdom))`, as stated [here](https://lxml.de/apidoc/lxml.etree.html#lxml.etree.XSLT.tostring) – T-Dawg Jan 08 '22 at 20:53
  • 2
    With Python 3, one has to produce a unicode string when writing to stdout: `print(ET.tostring(newdom, pretty_print=True, encoding="unicode"))` – Patrick Ohly Jul 04 '22 at 08:16
  • XSLT is a stylesheet file I use it with xml in APEX to produce PDF reports. how can I do that in python please !.. Thank you in advance @unutbu – Saddam Meshaal Nov 17 '22 at 15:23
6

LXML is a widely used high performance library for XML processing in python based on libxml2 and libxslt - it includes facilities for XSLT as well.

R Sahu
  • 204,454
  • 14
  • 159
  • 270
miku
  • 181,842
  • 47
  • 306
  • 310
4

The best way is to do it using lxml, but it only support XSLT 1

import os
import lxml.etree as ET

inputpath = "D:\\temp\\"
xsltfile = "D:\\temp\\test.xsl"
outpath = "D:\\output"


for dirpath, dirnames, filenames in os.walk(inputpath):
            for filename in filenames:
                if filename.endswith(('.xml', '.txt')):
                    dom = ET.parse(inputpath + filename)
                    xslt = ET.parse(xsltfile)
                    transform = ET.XSLT(xslt)
                    newdom = transform(dom)
                    infile = unicode((ET.tostring(newdom, pretty_print=True)))
                    outfile = open(outpath + "\\" + filename, 'a')
                    outfile.write(infile)

to use XSLT 2 you can check options from Use saxon with python

Maliq
  • 315
  • 1
  • 3
  • 11
  • for people using XSLT3.x this is the way! – YaP Dec 10 '22 at 22:55
  • Long story short: for XSLT 2 and up, use [SaxonC](https://stackoverflow.com/a/58448374/1016065). As @Maliq suggests, lxml does not support any higher than XSLT 1. In my case, the line `transform = ET.XSLT(xslt)` errors because my `xsltfile` has ` – RolfBly May 01 '23 at 15:32