13

I'm looking for a lightweight version of poi-3.8.jar to use in a private Android app. For some reason I can't fit the whole 1.7 MB jar in the APK (and it would be wrong to do so anyway). Since I only need the doc -> html and xls -> html functionality, I'm not sure I need the whole jar file.

I've spent a couple of hours trying to figure out how to extract org.apache.poi.hwpf.converter.WordToHtmlConverter.java from poi/hwpf/converter, but it depends on a lot of other classes. That doesn't really surprise me, but I was thinking that maybe someone here would know which packages I can get rid of to make the jar smaller. I'll be glad to spend more time on it, unless someone here tells me it's a waste of time and that EVERYTHING in the sources is needed to convert doc files to html.

I don't need anything that displays anything, I just need the "simple" doc to html (and xls to html if possible) features. I don't need anything related to PDF, PowerPoint, Outlook or whatever.

I'll be glad to share whatever I find out

Cheers

f_puras
Johann Hilbold

3 Answers

7

Well, I was able to do most of what I was asking about here, that is, importing the jar files. I ran into at least two kinds of problems:

- Eclipse did not have enough RAM, which made dexing my classes crash most of the time (fixed by adjusting the -Xmx and -Xms values in eclipse.ini).
- The 64k method limit for each DEX file made things complicated. I had to split all the required POI jars into several DEX files, following the tutorial from the Android blog: http://android-developers.blogspot.com/2011/07/custom-class-loading-in-dalvik.html
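To illustrate the splitting step, here is a minimal sketch (not the code used in this answer) that groups jars so that each group stays under the 64k method limit of a single DEX file. The class name `DexGrouper`, the jar names and the method counts are all hypothetical; real counts would have to be measured on the jars you actually use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DexGrouper {
    // Maximum number of method references in one DEX file (the "64k" limit).
    static final int DEX_METHOD_LIMIT = 65_536;

    /** Greedy first-fit: put each jar into the first group that still has room. */
    static List<List<String>> group(Map<String, Integer> methodCounts, int limit) {
        List<List<String>> groups = new ArrayList<>();
        List<Integer> totals = new ArrayList<>();
        for (Map.Entry<String, Integer> e : methodCounts.entrySet()) {
            int count = e.getValue();
            if (count > limit) {
                // A single jar over the limit has to be split into smaller jars first.
                throw new IllegalArgumentException(e.getKey() + " alone exceeds the limit");
            }
            int slot = -1;
            for (int i = 0; i < groups.size(); i++) {
                if (totals.get(i) + count <= limit) {
                    slot = i;
                    break;
                }
            }
            if (slot < 0) {
                // No existing group has room: open a new one (a new DEX file).
                groups.add(new ArrayList<>());
                totals.add(0);
                slot = groups.size() - 1;
            }
            groups.get(slot).add(e.getKey());
            totals.set(slot, totals.get(slot) + count);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Hypothetical jar names and method counts, for illustration only.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("poi-part1.jar", 40_000);
        counts.put("poi-part2.jar", 30_000);
        counts.put("poi-scratchpad-part1.jar", 20_000);
        System.out.println(group(counts, DEX_METHOD_LIMIT));
    }
}
```

Each resulting group would then be dexed separately and loaded at runtime as described in the tutorial linked above. First-fit is not optimal packing, but it is simple and usually good enough for a handful of jars.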

The real answer to my question is: "yes you need everything in the jar". I made it work for the basic "non open xml" files. My app does the conversion to html quite well, and it's fast enough too.

On a side note, I was also trying to do the same thing with "open XML" files, and it's much more complicated. My little project doesn't do what it's supposed to do: I get a strange exception while the XMLBeans class is being initialized. Here's my trace (sorry for the ugliness):

12-19 12:07:10.790: W/dalvikvm(13385): Exception
Ljava/lang/RuntimeException; thrown while initializing
Lorg/apache/xmlbeans/impl/regex/SchemaRegularExpression;
12-19 12:07:10.790: W/dalvikvm(13385): Exception
Ljava/lang/ExceptionInInitializerError; thrown while initializing
Lorg/apache/xmlbeans/impl/schema/BuiltinSchemaTypeSystem;
12-19 12:07:10.790: D/dalvikvm(13385): Method.invoke() on bad class
Lorg/apache/xmlbeans/impl/schema/BuiltinSchemaTypeSystem; failed
12-19 12:07:10.790: W/dalvikvm(13385): Exception
Ljava/lang/ExceptionInInitializerError; thrown while initializing
Lorg/apache/xmlbeans/XmlBeans;
12-19 12:07:10.790: W/System.err(13385):
java.lang.reflect.InvocationTargetException
12-19 12:07:10.790: W/System.err(13385):    at
java.lang.reflect.Method.invokeNative(Native Method)
12-19 12:07:10.790: W/System.err(13385):    at
java.lang.reflect.Method.invoke(Method.java:491)
12-19 12:07:10.790: W/System.err(13385):    at
t.fze.TestOfficeAndroidActivity.onCreate(TestOfficeAndroidActivity.java:55)
12-19 12:07:10.790: W/System.err(13385):    at
android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1048)
12-19 12:07:10.790: W/System.err(13385):    at
android.app.ActivityThread.performLaunchActivity(ActivityThread.java:1712)
12-19 12:07:10.790: W/System.err(13385):    at
android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:1764)
12-19 12:07:10.790: W/System.err(13385):    at
android.app.ActivityThread.access$1500(ActivityThread.java:122)
12-19 12:07:10.790: W/System.err(13385):    at
android.app.ActivityThread$H.handleMessage(ActivityThread.java:1002)
12-19 12:07:10.790: W/System.err(13385):    at
android.os.Handler.dispatchMessage(Handler.java:99)
12-19 12:07:10.790: W/System.err(13385):    at
android.os.Looper.loop(Looper.java:132)
12-19 12:07:10.790: W/System.err(13385):    at
android.app.ActivityThread.main(ActivityThread.java:4025)
12-19 12:07:10.790: W/System.err(13385):    at
java.lang.reflect.Method.invokeNative(Native Method)
12-19 12:07:10.790: W/System.err(13385):    at
java.lang.reflect.Method.invoke(Method.java:491)
12-19 12:07:10.790: W/System.err(13385):    at
com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:841)
12-19 12:07:10.790: W/System.err(13385):    at
com.android.internal.os.ZygoteInit.main(ZygoteInit.java:599)
12-19 12:07:10.790: W/System.err(13385):    at
dalvik.system.NativeStart.main(Native Method)
12-19 12:07:10.790: W/System.err(13385): Caused by:
org.apache.poi.POIXMLException:
java.lang.reflect.InvocationTargetException
12-19 12:07:10.790: W/System.err(13385):    at
org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:62)
12-19 12:07:10.790: W/System.err(13385):    at
org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:414)
12-19 12:07:10.790: W/System.err(13385):    at
org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:155)
12-19 12:07:10.790: W/System.err(13385):    at
org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:174)
12-19 12:07:10.790: W/System.err(13385):    at
org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:63)
12-19 12:07:10.790: W/System.err(13385):    at
org.apache.poi.ss.examples.html.ToHtml.create(ToHtml.java:139)
12-19 12:07:10.790: W/System.err(13385):    at
org.apache.poi.ss.examples.html.ToHtml.create(ToHtml.java:123)
12-19 12:07:10.790: W/System.err(13385):    ... 16 more
12-19 12:07:10.790: W/System.err(13385): Caused by:
java.lang.reflect.InvocationTargetException
12-19 12:07:10.790: W/System.err(13385):    at
java.lang.reflect.Constructor.constructNative(Native Method)
12-19 12:07:10.790: W/System.err(13385):    at
java.lang.reflect.Constructor.newInstance(Constructor.java:416)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:60)
12-19 12:07:10.800: W/System.err(13385):    ... 22 more
12-19 12:07:10.800: W/System.err(13385): Caused by:
java.lang.ExceptionInInitializerError
12-19 12:07:10.800: W/System.err(13385):    at
org.openxmlformats.schemas.drawingml.x2006.main.ThemeDocument$Factory.parse(ThemeDocument.java:71)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.poi.xssf.model.ThemesTable.<init>(ThemesTable.java:38)
12-19 12:07:10.800: W/System.err(13385):    ... 25 more
12-19 12:07:10.800: W/System.err(13385): Caused by:
java.lang.ExceptionInInitializerError
12-19 12:07:10.800: W/System.err(13385):    at
java.lang.reflect.Method.invokeNative(Native Method)
12-19 12:07:10.800: W/System.err(13385):    at
java.lang.reflect.Method.invoke(Method.java:491)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.XmlBeans.getNoType(XmlBeans.java:856)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.XmlBeans.<clinit>(XmlBeans.java:881)
12-19 12:07:10.800: W/System.err(13385):    ... 27 more
12-19 12:07:10.800: W/System.err(13385): Caused by:
java.lang.ExceptionInInitializerError
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.schema.BuiltinSchemaTypeSystem.fillInType(BuiltinSchemaTypeSystem.java:1025)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.schema.BuiltinSchemaTypeSystem.<clinit>(BuiltinSchemaTypeSystem.java:223)
12-19 12:07:10.800: W/System.err(13385):    ... 31 more
12-19 12:07:10.800: W/System.err(13385): Caused by:
java.lang.RuntimeException: Installation Problem???  Couldn't load
messages: Can't find resource for bundle
'org.apache.xmlbeans.impl.regex.message_fr_FR', key ''
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.RegexParser.setLocale(RegexParser.java:88)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.RegexParser.<init>(RegexParser.java:78)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.ParserForXMLSchema.<init>(ParserForXMLSchema.java:28)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.RegularExpression.setPattern(RegularExpression.java:2996)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.RegularExpression.setPattern(RegularExpression.java:3009)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.RegularExpression.<init>(RegularExpression.java:2975)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.SchemaRegularExpression.<init>(SchemaRegularExpression.java:27)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.SchemaRegularExpression.<init>(SchemaRegularExpression.java:23)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.SchemaRegularExpression$1.<init>(SchemaRegularExpression.java:44)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.SchemaRegularExpression.buildKnownPatternMap(SchemaRegularExpression.java:43)
12-19 12:07:10.800: W/System.err(13385):    at
org.apache.xmlbeans.impl.regex.SchemaRegularExpression.<clinit>(SchemaRegularExpression.java:38)
12-19 12:07:10.800: W/System.err(13385):    ... 33 more
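The innermost "Caused by" in this trace says that the bundle 'org.apache.xmlbeans.impl.regex.message_fr_FR' could not be found. One plausible reading (an assumption, not something confirmed in this thread) is that the XMLBeans message .properties files were not packaged alongside the dexed classes. A minimal sketch of a check for that, using only the standard `ResourceBundle` API; the class name `BundleLookupDemo` is hypothetical and the base name is taken from the trace:

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

public class BundleLookupDemo {
    /** Returns true if a bundle with this base name can be loaded for the locale. */
    static boolean bundleAvailable(String baseName, Locale locale) {
        try {
            ResourceBundle.getBundle(baseName, locale);
            return true;
        } catch (MissingResourceException e) {
            // Neither the locale-specific nor the fallback bundle is on the classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        // Base name derived from the trace above (without the _fr_FR suffix).
        // It resolves only if the XMLBeans message .properties files are reachable.
        String baseName = "org.apache.xmlbeans.impl.regex.message";
        System.out.println(bundleAvailable(baseName, Locale.FRANCE));
    }
}
```

If this prints false in the environment where the app runs, the resource files are missing from the class path, which would match the "Couldn't load messages" error.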
Johann Hilbold
  • Would you be interested in sharing your code for Android so far? There seems to be little interest in reading MS Office documents on Android: I haven't been able to find anything useful other than your post. On a side note, there are a lot of closed-source APIs, and I've tried to contact the different companies, but none of them have answered any of my queries, and I'm desperately trying to find a decent (or actually any) solution for handling documents on Android ;\ Like you, I only need the conversion to html for the different formats, as I'm only interested in displaying the docs. – Darwind Dec 27 '11 at 21:55
  • Hi Darwind, yes you can take a look at my code. Actually, I was able to make it work (yay) with POI. I posted a full explanation on my company's blog (sorry, it's in French!). Part 1 (simple POI usage on Android): http://blog.oxiane.com/2011/12/30/visualiser-un-fichier-office-doc-xls-ppt-sous-android/ Part 2 (for Office 2007+ documents): http://blog.oxiane.com/2011/12/30/visualiser-un-fichier-office-doc-xls-ppt%E2%80%A6-sous-android-23/ Or you can take a look at my code (quite messy, but it works!): https://code.google.com/p/display-msoffice-docs-android-with-apache-poi/ – Johann Hilbold Jan 02 '12 at 00:25
  • On a side note, I found out (shortly after killing myself with this POI port) that there's a MUCH simpler solution to handle Office 2007+ documents. I used this lib: http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2006/11/21/openxmlandjava.aspx It doesn't work with "binary" Office documents (Word 2003...), so you'd still need POI for those files, but it's way easier than porting POI. In fact, it took me only a few tweaks to make it work. I haven't shared my code yet though. – Johann Hilbold Jan 02 '12 at 00:34
  • Impressive work so far ;-) I tried out the apk in your repo and it looks like it's working, although the conversion to html disregarded the colors in the spreadsheet. However, the app dumps around 38 MB of data on the SD card; the dex files of course take up a lot of space. Because the app takes up so much space on the SD card, it really isn't an "acceptable" solution for me, I'm afraid ;\ I'm gonna have a look at the POI library later this week and see if I can scrape the docToHtml and xlsToHtml parts out of it. Thank you so much for sharing your findings and code so far ;-) – Darwind Jan 02 '12 at 22:15
  • Hi, yes, I know about the missing colors in the output and the incredible amount of space taken on the SD card. I did that POI port more for the challenge :) I suggest you take a look at that other link http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2006/11/21/openxmlandjava.aspx it's much lighter. I'll put some code somewhere whenever I get a chance. – Johann Hilbold Jan 03 '12 at 08:54
  • Hi, I'm trying to split the jar files into multiple dex files by following the link mentioned above, but I still get the 'too many methods' error when I run the build.xml. Could you please tell me how you split the jars into multiple dex files? Thanks a bunch :) – neeraj narang Jun 09 '13 at 03:49
  • Hi there, the trick is to make more jars, so that each contains fewer methods. If you take a look at the code I have at https://code.google.com/p/display-msoffice-docs-android-with-apache-poi/source/browse/#svn%2Ftrunk%2FTestOfficeAndroid, the jars are really small. I remember that it's a real pain to make it work; I would suggest that you add the jars one by one so you know which one has too many methods. Or you could simply take my jars from the link above, except that the POI version I used is about 2 years old... Good luck! – Johann Hilbold Jun 11 '13 at 08:20
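When deciding where to cut a large jar into smaller ones, it can help to see how its classes are distributed across packages. The following is a rough sketch using only `java.util.jar`; the class name `JarClassCounter` is hypothetical. Note the assumption: it counts .class entries, which is only a proxy for the number of methods that the DEX limit actually applies to.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;

public class JarClassCounter {
    /** Counts .class entries per package in a jar read from the given stream. */
    static Map<String, Integer> classesPerPackage(InputStream jar) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (JarInputStream in = new JarInputStream(jar)) {
            JarEntry entry;
            while ((entry = in.getNextJarEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !name.endsWith(".class")) {
                    continue; // skip directories and non-class resources
                }
                int slash = name.lastIndexOf('/');
                String pkg = slash < 0
                        ? "(default)"
                        : name.substring(0, slash).replace('/', '.');
                counts.merge(pkg, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarClassCounter <path-to-jar>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            // Print one line per package: class count, then package name.
            classesPerPackage(in).forEach((pkg, n) -> System.out.println(n + "\t" + pkg));
        }
    }
}
```

Packages with the most classes are the natural candidates to move into a jar of their own.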
3

You could also use ProGuard shrinking. It can reduce the size of the APK severalfold.

bvk256
  • Thanks for the idea, but unfortunately it was not the size of the APK I was trying to reduce, but the number of methods in the application. If you read all the comments above, my main problem was that there were more than 64k methods in the jar file I wanted to import. And 64k methods is just too many, it wouldn't let me compile the APK! – Johann Hilbold Apr 14 '14 at 08:38
2

I recently created a "port" (if I may say so) of XSSF: https://stackoverflow.com/a/25564538/2155217

It's enough for reading and writing XLSX files. It might not work properly if the file contains extra features such as drawings or charts.

Community
  • Hi Andrew! awesome work you did there. I haven't tested it yet, but it looks like you use the method "run > see what failed > add that file to the jar". That was also my first way of doing it, until I realized most of the jar was actually needed. Have you been able to identify more precisely what features are missing? – Johann Hilbold Nov 04 '14 at 13:12