I've been working on a project that uses Apache POI to read .ppt files and change some attributes of the SlideShowDocInfoAtom record in the file.
I can read the file using HSLFSlideShow. However, with a large .ppt file (e.g. over 1 GB) and my application's JVM max heap size restricted to 2 GB, POI throws an OutOfMemoryError.
After reading the source code, I found that POI creates a byte array when it reads a stream of the file. In the 1 GB file, the PowerPoint Document stream can be up to 1 GB, so the byte array alone takes about 1 GB of heap, and that apparently is what makes the JVM run out of memory.
So, is there any way to read a large .ppt file without enlarging the JVM heap size? I only want to read some document info from the file, and I don't want to load large blocks such as audio or video into memory.
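
To make the goal concrete, here is a minimal sketch of the kind of access I have in mind. It uses only the JDK, not POI, and rests on assumptions: that each record in the PowerPoint Document stream starts with an 8-byte little-endian header (version/instance, type, length), that container records have version 0xF, and that SlideShowDocInfoAtom has record type 0x0401. The class `RecordHeaderScan` and its methods are hypothetical names, and the input is synthetic. The scan reads headers and skips atom bodies, so only the wanted record is ever held in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class RecordHeaderScan {
    // Assumed record type of SlideShowDocInfoAtom in the binary format.
    static final int SS_DOC_INFO_ATOM = 0x0401;
    // Refuse to buffer a body larger than this (arbitrary illustrative cap).
    static final long MAX_BODY = 1 << 20;

    /** Returns the body of the first record of the given type, or null if none is found. */
    static byte[] findRecord(InputStream in, int wantedType) throws IOException {
        byte[] header = new byte[8];
        while (readFully(in, header)) {
            ByteBuffer h = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
            int verAndInstance = h.getShort(0) & 0xFFFF;
            int type = h.getShort(2) & 0xFFFF;
            long length = h.getInt(4) & 0xFFFFFFFFL;
            if (type == wantedType) {
                if (length > MAX_BODY) throw new IOException("record body too large: " + length);
                byte[] body = new byte[(int) length];
                if (!readFully(in, body)) throw new EOFException("truncated record body");
                return body;
            }
            boolean container = (verAndInstance & 0x0F) == 0x0F;
            // Children of a container follow its header directly, so just keep reading.
            // Atom bodies are skipped instead of being read into memory.
            if (!container) skipFully(in, length);
        }
        return null;
    }

    /** Fills buf completely; returns false only if the stream ended before the first byte. */
    static boolean readFully(InputStream in, byte[] buf) throws IOException {
        int n = 0;
        while (n < buf.length) {
            int r = in.read(buf, n, buf.length - n);
            if (r < 0) {
                if (n == 0) return false;
                throw new EOFException("truncated record");
            }
            n += r;
        }
        return true;
    }

    /** Skips exactly count bytes; InputStream.skip alone may skip fewer. */
    static void skipFully(InputStream in, long count) throws IOException {
        while (count > 0) {
            long skipped = in.skip(count);
            if (skipped > 0) {
                count -= skipped;
            } else if (in.read() < 0) {
                throw new EOFException("truncated record body");
            } else {
                count--;
            }
        }
    }

    /** Builds one record: 8-byte little-endian header followed by the body. */
    static byte[] record(int verAndInstance, int type, byte[] body) {
        ByteBuffer buf = ByteBuffer.allocate(8 + body.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) verAndInstance).putShort((short) type).putInt(body.length).put(body);
        return buf.array();
    }

    /** Synthetic stream: one container holding a filler atom and a fake 3-byte doc-info atom. */
    static byte[] sample() {
        byte[] filler = record(0x1, 0x03E9, new byte[] {9, 9, 9, 9});
        byte[] docInfo = record(0x1, SS_DOC_INFO_ATOM, new byte[] {1, 2, 3});
        byte[] children = new byte[filler.length + docInfo.length];
        System.arraycopy(filler, 0, children, 0, filler.length);
        System.arraycopy(docInfo, 0, children, filler.length, docInfo.length);
        return record(0x0F, 0x03E8, children);
    }

    public static void main(String[] args) throws IOException {
        byte[] body = findRecord(new ByteArrayInputStream(sample()), SS_DOC_INFO_ATOM);
        System.out.println(body == null ? "not found" : "found body of " + body.length + " bytes");
    }
}
```

With a real file, the `InputStream` would have to come from the OLE2 container without buffering the whole stream. I assume something like `POIFSFileSystem.createDocumentInputStream("PowerPoint Document")` on a file-backed `POIFSFileSystem` could provide that, but I have not verified its memory behaviour. The sketch also returns the first matching record in stream order, which may not be the current one if the file has been saved incrementally, and it does not cover writing the changed attributes back.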