I want to convert an .xps file to text with Java or Kotlin.
Aspose is too expensive for me.
I found Java-AXP, which should do the job, but I could not find any documentation or sample code for it.
I managed to add the Java-AXP core to my project libraries in IntelliJ, so I can now access its classes from within my project. But actually understanding how to use the library to get the text conversion is way beyond me.
The only example code I found that attempts to use the library is here:
File b = new File("D:\\Chemia\\Clients\\Clients\\Docs\\Equipment\\CCP\\XPSOCRDemo.xps");
IXPSAccess access = new XPSFileAccessImpl(b);
IXPSFileAccess xpsFileAccess = access.getFileAccess();
XPSDocumentAccessImpl xpsimpl = new XPSDocumentAccessImpl(access);
int docunum = xpsimpl.getFirstDocNum();
IDocumentStructure structure = xpsimpl.getDocumentStructure(docunum);
List<IOutlineEntry> list = (List<IOutlineEntry>)
        structure.getDocumentStructureOutline().getDocumentOutline().getOutlineEntry();
list.stream().forEach(restu -> {
    System.out.println(restu.getOutlineTarget());
});
However, I get the same null pointer exception as the OP and I can't fix it. I need working example code to continue on my own.
So how can I use Java-AXP? Or are there alternative libraries? I am open to using anything.
Thanks for any help.
Edit:
@Abra Thanks.
Exception in thread "main" java.lang.IllegalStateException: structure must not be null
at MainKt.main(main.kt:80)
at MainKt.main(main.kt)
73 | val structure = xpsimpl.getDocumentStructure(0)
80 | val list = structure.documentStructureOutline?.documentOutline?.outlineEntry as List<IOutlineEntry>?
So xpsimpl.getDocumentStructure(docunum) returns null.
I cannot use a website for my purpose because the XPS files contain sensitive information.
Edit: Here is the solution I found.
I found this repo, which uses Java-AXP. However, there seemed to be some differences from the java-axp-core.jar I had used from Google Code.
So I copied the javaaxp folder to my project.
It needed Apache Tika so I downloaded tika-app-2.1.0.jar from Apache Tika and included it in my project via the Project Structure menu.
In XPSZipFileAccess.java I had a wrong import for IOUtils so I changed
import org.apache.tika.io.IOUtils;
to
import org.apache.commons.io.IOUtils;
Now Java-AXP could resolve all references.
Then I copied XPSParser.java to the root of my project and made sure all imports work.
In DocSaver.java on line 90 I found where XPSParser was used and I adapted the code so I could convert the XPS file to text:
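import java.io.File
import java.io.FileInputStream
import org.apache.tika.metadata.Metadata
import org.apache.tika.parser.ParseContext
import org.apache.tika.sax.BodyContentHandler

// XPSParser is the class copied into the project root in the previous step
val xpsFile = File(path + "filename.xps")
// The default BodyContentHandler stops at 100,000 characters; BodyContentHandler(-1) removes the limit
val handler = BodyContentHandler()
// use {} closes the stream even if parsing throws
FileInputStream(xpsFile).use { inputStream ->
    XPSParser().parse(inputStream, handler, Metadata(), ParseContext())
}
val docContents = handler.toString()
println(docContents)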
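For anyone who would rather avoid the extra dependencies, a rough alternative might work as well: an XPS file is a ZIP package, the pages are stored as .fpage XML parts, and the text runs sit in the UnicodeString attribute of Glyphs elements. The sketch below relies only on that layout and the JDK. The class name XpsTextSketch and the method extractText are made up for illustration and are not part of Java-AXP or Tika. It reads pages in archive order (which may differ from the real page order), ignores glyph positions, and does not handle packages whose parts are split into interleaved pieces, so treat it as a starting point rather than a complete converter.

```java
import java.io.File;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of any library mentioned above.
public class XpsTextSketch {

    // Collects the UnicodeString of every Glyphs element in every .fpage part.
    public static String extractText(File xpsFile) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
        DocumentBuilder builder = factory.newDocumentBuilder();
        StringBuilder text = new StringBuilder();

        try (ZipFile zip = new ZipFile(xpsFile)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // Assumption: each page is stored as a single *.fpage entry
                if (!entry.getName().toLowerCase().endsWith(".fpage")) {
                    continue;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    NodeList glyphs = builder.parse(in)
                            .getElementsByTagNameNS("*", "Glyphs");
                    for (int i = 0; i < glyphs.getLength(); i++) {
                        String run = ((Element) glyphs.item(i))
                                .getAttribute("UnicodeString");
                        // A leading "{}" escapes a string that starts with "{"
                        if (run.startsWith("{}")) {
                            run = run.substring(2);
                        }
                        if (!run.isEmpty()) {
                            text.append(run).append('\n');
                        }
                    }
                }
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Usage: java XpsTextSketch path/to/file.xps
        System.out.print(extractText(new File(args[0])));
    }
}
```

Each Glyphs element is usually one text run, so the output has one run per line and may need some cleanup to recover sentences. Everything stays local, which matters for files with sensitive content.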
Hope it helps someone.