
This question is very similar to an earlier question, but it relates to docx4j instead of Flying Saucer.

I'm using docx4j to render an xhtml document to docx through a servlet which returns the generated docx document. The xhtml document features an image which is requested from another servlet. The image servlet checks who is logged in before returning the appropriate image. The code below shows how the image is requested:

<img height="140" width="140" src="http://localhost:8080/myapp/servlet/DisplayPic" />

My problem is that the HTTP request for the image comes from the XHTMLImporter (I think) and not from the logged-in user, so the image servlet doesn't know who's logged in and the desired image is not returned.

I'm currently using the code below to render the xhtml document:

XHTMLImporter.setHyperlinkStyle("Hyperlink");
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();

NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
ndp.unmarshalDefaultNumbering();

wordMLPackage.getMainDocumentPart().getContent().addAll(XHTMLImporter.convert(xhtmlDocAsString, null, wordMLPackage));

In Flying Saucer I was able to use a ReplacedElementFactory, but that doesn't seem to be something docx4j uses. Is there a way to replace elements during the conversion process?

Edd
  • I've had a go at embedding base64-encoded images in the HTML, since I could do the HTML replacement before the conversion, but docx4j doesn't seem to work with base64 images – Edd Jul 06 '12 at 13:41
  • I've managed to do some filthy reflection to extend the Docx4jReplacedElementFactory and get the XHTMLImporter to use my ReplacedElementFactory, but it doesn't work. I think images aren't included via the ReplacedElementFactory but are added at a later stage in the conversion – Edd Jul 06 '12 at 15:16
  • import of base64 encoded image should work (see XHTMLImporter at line 976) – JasonPlutext Jul 09 '12 at 12:52
  • There doesn't seem to be anything which relates to base 64 there... I'm on version 2.8.0 – Edd Jul 09 '12 at 16:43
  • Had a look on [github](https://github.com/plutext/docx4j/blob/master/pom.xml) and could see the base64 stuff there, but the version is 2.8.1-SNAPSHOT and I'd rather hold off using a snapshot version unless necessary. Please add your suggestion as an answer though, as it does seem to be the better solution – Edd Jul 11 '12 at 16:54
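
For reference, the base64 approach discussed in the comments above could look roughly like the sketch below. It is only an illustration: InlineImageSketch, toDataUri and inlineImage are made-up names, the image bytes are assumed to have been loaded already on behalf of the logged-in user, and going by the comments it would only help on a docx4j version whose XHTMLImporter understands data URIs (apparently not 2.8.0).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper: inlines an image into the XHTML as a data URI before conversion. */
class InlineImageSketch {

    /** Builds a data URI of the form "data:<mimeType>;base64,<encoded bytes>". */
    static String toDataUri(byte[] imageBytes, String mimeType) {
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(imageBytes);
    }

    /** Replaces src="<imageUrl>" with src="<data URI>" wherever it occurs in the XHTML. */
    static String inlineImage(String xhtml, String imageUrl, byte[] imageBytes, String mimeType) {
        return xhtml.replace("src=\"" + imageUrl + "\"",
                "src=\"" + toDataUri(imageBytes, mimeType) + "\"");
    }

    public static void main(String[] args) {
        String url = "http://localhost:8080/myapp/servlet/DisplayPic";
        String xhtml = "<img height=\"140\" width=\"140\" src=\"" + url + "\" />";
        // Placeholder bytes; a real caller would pass the user's actual image data.
        byte[] placeholder = "not-a-real-image".getBytes(StandardCharsets.UTF_8);
        System.out.println(inlineImage(xhtml, url, placeholder, "image/png"));
    }
}
```

The rewritten string would then be handed to the importer in place of the original XHTML, so no second HTTP request is needed for that image.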

1 Answer


Oh what fun I've had! I have a convoluted, complex and crazy solution and I know @JasonPlutext will provide a very simple and obvious solution that I overlooked.

Here it is. This code generates the word document to an output stream:

        outputStream = response.getOutputStream();

        XHTMLImporter.setHyperlinkStyle("Hyperlink");

        // Create an empty docx package
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();

        NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
        wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
        ndp.unmarshalDefaultNumbering();

        // Convert the XHTML, and add it into the empty docx we made
        List<Object> wmlObjects = getWmlObjects(wordMLPackage, xhtmlDocumentAsString);
        wordMLPackage.getMainDocumentPart().getContent().addAll(wmlObjects);

        SaveToZipFile saver = new SaveToZipFile(wordMLPackage);
        saver.save(outputStream);

The getWmlObjects method is my own. It mimics the XHTMLImporter.convert method but does everything itself, with a lot of reflection. It injects two objects that override the default Docx4jUserAgent and Docx4jReplacedElementFactory objects in the DocxRenderer (which is a field of the importer instance). See below:

private List<Object> getWmlObjects(WordprocessingMLPackage wordMLPackage, String xhtmlDocumentAsString) {

    try {
        DocxRenderer renderer = new DocxRenderer();

        // override the user agent
        FieldAccessUtils.setField(renderer, "userAgent", new ProfileImageDocx4jUserAgent());

        // override the replaced element factory
        Docx4jDocxOutputDevice outputDevice = (Docx4jDocxOutputDevice) FieldAccessUtils.getField(renderer,
                "_outputDevice");
        renderer.getSharedContext().setReplacedElementFactory(
                new ProfileImageDocx4jReplacedElementFactory(outputDevice));

        // build the XHTMLImporter instance as it does in XHTMLImporter.convert but with our new renderer

        XHTMLImporter importer; // = new XHTMLImporter(wordMLPackage);
        Constructor<XHTMLImporter> constructor = XHTMLImporter.class
                .getDeclaredConstructor(WordprocessingMLPackage.class);
        constructor.setAccessible(true);
        importer = constructor.newInstance(wordMLPackage);
        constructor.setAccessible(false);

        FieldAccessUtils.setField(importer, "renderer", renderer);

        InputSource is = new InputSource(new BufferedReader(new StringReader(xhtmlDocumentAsString)));
        Document dom = XMLResource.load(is).getDocument();

        renderer.setDocument(dom, null);
        renderer.layout();

        // use reflection to do: importer.traverse(renderer.getRootBox(), FieldAccessUtils.getField(importer, "imports"), null);
        Method traverseMethod = importer.getClass().getDeclaredMethod("traverse", Box.class, List.class,
                TableProperties.class);
        traverseMethod.setAccessible(true);
        traverseMethod.invoke(importer, renderer.getRootBox(), FieldAccessUtils.getField(importer, "imports"), null);
        traverseMethod.setAccessible(false);

        return (List<Object>) FieldAccessUtils.getField(importer, "imports");

    } catch (SecurityException e) {
        getLogger().error(ExceptionUtils.getStackTrace(e));
    } catch (NoSuchMethodException e) {
        getLogger().error(ExceptionUtils.getStackTrace(e));
    } catch (IllegalArgumentException e) {
        getLogger().error(ExceptionUtils.getStackTrace(e));
    } catch (IllegalAccessException e) {
        getLogger().error(ExceptionUtils.getStackTrace(e));
    } catch (InvocationTargetException e) {
        getLogger().error(ExceptionUtils.getStackTrace(e));
    } catch (InstantiationException e) {
        getLogger().error(ExceptionUtils.getStackTrace(e));
    }

    try {
        // plan B
        return XHTMLImporter.convert(xhtmlDocumentAsString, null, wordMLPackage);
    } catch (Docx4JException e) {
        getLogger().error(ExceptionUtils.getStackTrace(e));
    }

    return null;
}
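
FieldAccessUtils is not shown here. A minimal sketch of what such a helper might look like follows, assuming it is nothing more than a thin wrapper around java.lang.reflect.Field; the real class may well differ. The sketch walks up the class hierarchy because the target field could be declared on a superclass, and it reports failures as IllegalArgumentException, which getWmlObjects above already catches before falling back to plan B. FieldAccessDemo, Base and Child exist only for illustration.

```java
import java.lang.reflect.Field;

/** Hypothetical stand-in for the FieldAccessUtils helper used above. */
final class FieldAccessUtils {

    private FieldAccessUtils() { }

    /** Looks the field up on the given class, then on each superclass in turn. */
    private static Field findField(Class<?> type, String name) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                Field field = c.getDeclaredField(name);
                field.setAccessible(true);
                return field;
            } catch (NoSuchFieldException e) {
                // not declared on this class; try the superclass
            }
        }
        throw new IllegalArgumentException("No field '" + name + "' on " + type.getName());
    }

    /** Reads a (possibly private) field from the target object. */
    static Object getField(Object target, String name) {
        try {
            return findField(target.getClass(), name).get(target);
        } catch (IllegalAccessException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Writes a (possibly private) field on the target object. */
    static void setField(Object target, String name, Object value) {
        try {
            findField(target.getClass(), name).set(target, value);
        } catch (IllegalAccessException e) {
            throw new IllegalArgumentException(e);
        }
    }
}

/** Illustration only: a private field declared on a superclass. */
class FieldAccessDemo {
    static class Base { private String agent = "default"; }
    static class Child extends Base { }

    public static void main(String[] args) {
        Child child = new Child();
        FieldAccessUtils.setField(child, "agent", "custom");
        System.out.println(FieldAccessUtils.getField(child, "agent")); // prints "custom"
    }
}
```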

Then I just have my two customised classes. First, ProfileImageDocx4jUserAgent (which does the donkey work):

public class ProfileImageDocx4jUserAgent extends Docx4jUserAgent {

    /**
     * Replace the image where the DisplayUserPic servlet is being called.
     * <p>
     * From overridden method javadoc:
     * <p>
     * {@inheritDoc}
     */
    @Override
    public Docx4JFSImage getDocx4JImageResource(String uri) {

        if (StringUtils.contains(uri, "DisplayUserPic")) {

            InputStream input = null;
            try {

                input = ...;
                byte[] bytes = IOUtils.toByteArray(input);
                return new Docx4JFSImage(bytes);

            } catch (IOException e) {
                getLogger().error(ExceptionUtils.getStackTrace(e));
            } catch (ServiceException e) {
                getLogger().error(ExceptionUtils.getStackTrace(e));
            } finally {
                IOUtils.closeQuietly(input);
            }
        }

        // not a profile image, or loading it failed: fall back to the default behaviour
        return super.getDocx4JImageResource(uri);
    }
}

And ProfileImageDocx4jReplacedElementFactory (which gets the iText stuff to ignore the image at this point... otherwise there's an error logged but it still works fine):

public class ProfileImageDocx4jReplacedElementFactory extends Docx4jReplacedElementFactory {

    /**
     * Constructor.
     * 
     * @param outputDevice
     *            the output device
     */
    public ProfileImageDocx4jReplacedElementFactory(Docx4jDocxOutputDevice outputDevice) {
        super(outputDevice);
    }

    /**
     * Forces any images which use the DisplayUserPic servlet to be ignored.
     * <p>
     * From overridden method javadoc:
     * <p>
     * {@inheritDoc}
     */
    @Override
    public ReplacedElement createReplacedElement(LayoutContext layoutContext, BlockBox blockBox,
            UserAgentCallback userAgentCallback, int cssWidth, int cssHeight) {

        Element element = blockBox.getElement();
        if (element == null) {
            return null;
        }

        String nodeName = element.getNodeName();
        String src = element.getAttribute("src");
        if ("img".equals(nodeName) && src.contains("DisplayUserPic")) {
            return null;
        }

        // default behaviour
        return super.createReplacedElement(layoutContext, blockBox, userAgentCallback, cssWidth, cssHeight);
    }
}

I guess the docx4j guys will probably build something into docx4j to handle this sort of case, but for the moment (I think) this seems to be a good workaround.

Edd