Oh what fun I've had! I have a convoluted, complex and crazy solution and I know @JasonPlutext will provide a very simple and obvious solution that I overlooked.
Here it is. This code writes the generated Word document to an output stream:
outputStream = response.getOutputStream();
XHTMLImporter.setHyperlinkStyle("Hyperlink");
// Create an empty docx package
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
ndp.unmarshalDefaultNumbering();
// Convert the XHTML, and add it into the empty docx we made
List<Object> wmlObjects = getWmlObjects(wordMLPackage, xhtmlDocumentAsString);
wordMLPackage.getMainDocumentPart().getContent().addAll(wmlObjects);
SaveToZipFile saver = new SaveToZipFile(wordMLPackage);
saver.save(outputStream);
The method getWmlObjects is my own. It simulates the XHTMLImporter.convert method but does everything itself, with a lot of reflection. It basically injects a couple of objects to override the default Docx4jUserAgent and Docx4jReplacedElementFactory objects in the DocxRenderer (which is a field of the importer instance). See below:
private List<Object> getWmlObjects(WordprocessingMLPackage wordMLPackage, String xhtmlDocumentAsString) {
    try {
        DocxRenderer renderer = new DocxRenderer();
        // override the user agent
        FieldAccessUtils.setField(renderer, "userAgent", new ProfileImageDocx4jUserAgent());
        // override the replaced element factory
        Docx4jDocxOutputDevice outputDevice = (Docx4jDocxOutputDevice) FieldAccessUtils.getField(renderer,
                "_outputDevice");
        renderer.getSharedContext().setReplacedElementFactory(
                new ProfileImageDocx4jReplacedElementFactory(outputDevice));
        // build the XHTMLImporter instance as it does in XHTMLImporter.convert but with our new renderer
        XHTMLImporter importer; // = new XHTMLImporter(wordMLPackage);
        Constructor<XHTMLImporter> constructor = XHTMLImporter.class
                .getDeclaredConstructor(WordprocessingMLPackage.class);
        constructor.setAccessible(true);
        importer = constructor.newInstance(wordMLPackage);
        constructor.setAccessible(false);
        FieldAccessUtils.setField(importer, "renderer", renderer);
        InputSource is = new InputSource(new BufferedReader(new StringReader(xhtmlDocumentAsString)));
        Document dom = XMLResource.load(is).getDocument();
        renderer.setDocument(dom, null);
        renderer.layout();
        // use reflection to do: importer.traverse(renderer.getRootBox(), FieldAccessUtils.getField(importer, "imports"), null);
        Method traverseMethod = importer.getClass().getDeclaredMethod("traverse", Box.class, List.class,
                TableProperties.class);
        traverseMethod.setAccessible(true);
        traverseMethod.invoke(importer, renderer.getRootBox(), FieldAccessUtils.getField(importer, "imports"), null);
        traverseMethod.setAccessible(false);
        return (List<Object>) FieldAccessUtils.getField(importer, "imports");
    } catch (SecurityException | NoSuchMethodException | IllegalArgumentException
            | IllegalAccessException | InvocationTargetException | InstantiationException e) {
        // any reflection failure is logged, then we fall through to plan B
        getLogger().error(ExceptionUtils.getStackTrace(e));
    }
    try {
        // plan B
        return XHTMLImporter.convert(xhtmlDocumentAsString, null, wordMLPackage);
    } catch (Docx4JException e) {
        getLogger().error(ExceptionUtils.getStackTrace(e));
    }
    return null;
}
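FieldAccessUtils is not part of docx4j or the JDK, and its source isn't shown here. As a rough idea of what such a helper does, here is a minimal sketch. It assumes the helper looks the field up by name on the object's class (walking up through superclasses) and wraps failures in unchecked exceptions; the real helper may differ, for example by throwing IllegalAccessException directly. BaseRenderer and FancyRenderer are made-up toy classes that only exist to show the effect.

```java
import java.lang.reflect.Field;

/** Hypothetical stand-in for a FieldAccessUtils-style helper (an assumption, not the original code). */
final class FieldAccessUtils {

    private FieldAccessUtils() {
    }

    /** Finds a declared field by name on the given class or any of its superclasses. */
    private static Field findField(Class<?> type, String name) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredField(name);
            } catch (NoSuchFieldException e) {
                // not declared on this class, so try the superclass
            }
        }
        throw new IllegalArgumentException("No field '" + name + "' on " + type.getName());
    }

    /** Reads a (possibly private) field from the target object. */
    static Object getField(Object target, String name) {
        Field field = findField(target.getClass(), name);
        field.setAccessible(true);
        try {
            return field.get(target);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Overwrites a (possibly private) field on the target object. */
    static void setField(Object target, String name, Object value) {
        Field field = findField(target.getClass(), name);
        field.setAccessible(true);
        try {
            field.set(target, value);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

/** Toy class with a private field, standing in for a library class we cannot modify. */
class BaseRenderer {
    private String userAgent = "default";

    String describe() {
        return "agent=" + userAgent;
    }
}

/** Toy subclass: the private field lives on the superclass, so the lookup has to walk up. */
class FancyRenderer extends BaseRenderer {
}
```

With this sketch, FieldAccessUtils.setField(new FancyRenderer(), "userAgent", "custom") swaps the private field declared on BaseRenderer, which is the same kind of swap getWmlObjects performs on the renderer. Bear in mind that this sort of access depends on private field names, so it can break whenever the library renames them.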
Then I just have my two customised classes. First, ProfileImageDocx4jUserAgent (which does the donkey work):
public class ProfileImageDocx4jUserAgent extends Docx4jUserAgent {
    /**
     * Replace the image where the DisplayUserPic servlet is being called.
     * <p>
     * From overridden method javadoc:
     * <p>
     * {@inheritDoc}
     */
    @Override
    public Docx4JFSImage getDocx4JImageResource(String uri) {
        if (StringUtils.contains(uri, "DisplayUserPic")) {
            InputStream input = null;
            try {
                input = ...;
                byte[] bytes = IOUtils.toByteArray(input);
                return new Docx4JFSImage(bytes);
            } catch (IOException e) {
                getLogger().error(ExceptionUtils.getStackTrace(e));
            } catch (ServiceException e) {
                getLogger().error(ExceptionUtils.getStackTrace(e));
            } finally {
                IOUtils.closeQuietly(input);
            }
        }
        // default behaviour: used for all other URIs, and as a fallback if loading the image failed
        return super.getDocx4JImageResource(uri);
    }
}
And ProfileImageDocx4jReplacedElementFactory (which gets the iText stuff to ignore the image at this point... otherwise there's an error logged but it still works fine):
public class ProfileImageDocx4jReplacedElementFactory extends Docx4jReplacedElementFactory {
    /**
     * Constructor.
     *
     * @param outputDevice
     *            the output device
     */
    public ProfileImageDocx4jReplacedElementFactory(Docx4jDocxOutputDevice outputDevice) {
        super(outputDevice);
    }

    /**
     * Forces any images which use the DisplayUserPic servlet to be ignored.
     * <p>
     * From overridden method javadoc:
     * <p>
     * {@inheritDoc}
     */
    @Override
    public ReplacedElement createReplacedElement(LayoutContext layoutContext, BlockBox blockBox,
            UserAgentCallback userAgentCallback, int cssWidth, int cssHeight) {
        Element element = blockBox.getElement();
        if (element == null) {
            return null;
        }
        String nodeName = element.getNodeName();
        String src = element.getAttribute("src");
        if ("img".equals(nodeName) && src.contains("DisplayUserPic")) {
            return null;
        }
        // default behaviour
        return super.createReplacedElement(layoutContext, blockBox, userAgentCallback, cssWidth, cssHeight);
    }
}
I guess the docx4j guys will probably build something into docx4j to handle this sort of case, but for the moment (I think) this seems to be a good workaround.