21

I'm not sure what the "general" name of something like this might be. I'm looking for a library that gives me a file format to store different types of binary data in an expanding single file.

  • open source, non-GPL (LGPL ok)
  • C interface
  • the file format is a single file
  • multiple files within using a POSIX-like file API (or multiple "blobs" within using some other API)
  • file/structure editing is done in-place
  • reliable first, performant second
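
To make the requirements above concrete, here is a minimal sketch of the "multiple blobs in one expanding file" idea in plain C. It is only an illustration, not the API of any library mentioned in this thread: the names `blob_header`, `blob_put`, and `blob_get` and the record layout are all hypothetical. It is append-only (a rewritten blob is appended and the newest copy wins), so it does not meet the in-place editing requirement. It is also not crash-safe, and the header is written in host byte order, so the file is not portable across platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOB_NAME_MAX 32

/* Hypothetical on-disk record header: fixed-size name plus payload length.
 * Each record is this header followed by `size` payload bytes. */
typedef struct {
    char     name[BLOB_NAME_MAX];
    uint32_t size;
} blob_header;

/* Append a named blob to the end of the container file.
 * Names longer than BLOB_NAME_MAX - 1 bytes are truncated.
 * Returns 0 on success, -1 on I/O error. */
int blob_put(FILE *f, const char *name, const void *data, uint32_t size)
{
    blob_header h;

    assert(f != NULL && name != NULL);
    memset(&h, 0, sizeof h);
    strncpy(h.name, name, BLOB_NAME_MAX - 1);
    h.size = size;

    if (fseek(f, 0, SEEK_END) != 0) return -1;
    if (fwrite(&h, sizeof h, 1, f) != 1) return -1;
    if (size > 0 && fwrite(data, 1, size, f) != size) return -1;
    return fflush(f) == 0 ? 0 : -1;
}

/* Copy the newest blob called `name` into buf (capacity `cap` bytes).
 * Returns the blob size, or -1 if it is missing, too large, or on error. */
long blob_get(FILE *f, const char *name, void *buf, uint32_t cap)
{
    blob_header h;
    long     payload_pos = -1;
    uint32_t payload_size = 0;

    assert(f != NULL && name != NULL);
    if (fseek(f, 0, SEEK_SET) != 0) return -1;

    /* Linear scan over all records; a later record with the same name wins. */
    while (fread(&h, sizeof h, 1, f) == 1) {
        h.name[BLOB_NAME_MAX - 1] = '\0';
        if (strcmp(h.name, name) == 0) {
            payload_pos = ftell(f);
            payload_size = h.size;
        }
        if (fseek(f, (long)h.size, SEEK_CUR) != 0) return -1;
    }

    if (payload_pos < 0 || payload_size > cap) return -1;
    if (fseek(f, payload_pos, SEEK_SET) != 0) return -1;
    if (payload_size > 0 && fread(buf, 1, payload_size, f) != payload_size)
        return -1;
    return (long)payload_size;
}
```

The container would be opened in binary update mode (for example `fopen(path, "r+b")`, or `"w+b"` to create it). A real library would replace the linear scan with an index and add block allocation, so that blobs can be edited in place and freed space reused.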

Examples include:

Problems with the above:

  • whefs doesn't appear to be very mature, but best describes what I'm after
  • HDF, CDF, NetCDF are usable (also very reliable and fast), but they're rather complicated and I'm not entirely convinced of their support for opaque binary "blobs"

Edit:
Forgot to mention, one other relevant question:
Simple Virtual Filesystem in C/C++
Another similar question:
Is there an open-source alternative to Windows compound files?

Edit:
Added condition of in-place editing.

Edit:
whefs superseded by: whio_epfs

Ioan

5 Answers

6

This appears to do what I was looking for: libgsf

Still need to test its reliability/performance and how cross-platform the binary format is.

Ioan
  • Isn't it just for archives like gzip? What kind of virtual filesystem does libgsf support? – Chris Pillen Feb 06 '14 at 18:53
  • It supports GZip, but the use case here was more for support of this format: http://msdn.microsoft.com/en-us/library/dd942138.aspx – Ioan Feb 06 '14 at 21:27
  • Understood. One last question: What do you think about POLE? And, in the end, can libgsf even create such an OLE container with a fixed size? – Chris Pillen Feb 08 '14 at 12:54
  • I'm not familiar with that format and saw no mention of it in the libgsf docs. – Ioan Feb 10 '14 at 13:23
  • SQLite (http://sqlite.org/) will do the job in a robust way, with added relational sauce should you require it. – MikeW Jun 05 '17 at 10:06
  • @MikeW I thought of it as well, but at the time this question was written, it didn't perform well enough for my use case. – Ioan Jun 06 '17 at 11:53
  • @Ioan: well the requirement put "performance second" .... – MikeW Jun 07 '17 at 10:31
  • @MikeW Yes it did; that's why I said "for my use case". There was a minimum, even for secondary requirements. – Ioan Jun 07 '17 at 12:04
0

It sounds like you're talking about the Linux loopback device, which lets you treat a file on a filesystem as a first-class block device (and then proceed to mkfs, mount, etc.)

(What sort of platform are you targeting? A fully-featured Unix-like? Something in the embedded space with a small footprint?)

crazyscot
  • Cross-platform is the intent. Certainly Windows and an embedded Linux version. I considered the loopback device, but didn't mention it because I wasn't sure whether it could grow in size and it won't work on Windows. – Ioan Feb 25 '10 at 20:08
  • Right. If I were in your shoes, then, I'd also be looking into the various log-structured filesystems to see if there was one with an acceptable license which you could hack up well enough to work in userland and to work off a file as backing store (as opposed to a block device). – crazyscot Feb 25 '10 at 22:11
0

The wxWidgets library supports ZIP files (see http://docs.wxwidgets.org/stable/wx_wxarc.html#wxarc). This also has the advantage that you can look at the contents using a ZIP manager (e.g. WinZip).

A commercial alternative is Chilkat (http://www.chilkatsoft.com/).

If security is a concern, encrypt the file contents and mangle the file names in the ZIP archive.

Patrick
  • Security isn't a concern. I haven't looked into normal archive types too much, but I wonder how well they perform with regards to large data sets, high speed, random access of the internal file contents, and simultaneous reader/writer... – Ioan Feb 25 '10 at 21:03
0

Maybe the Eet library from the Enlightenment project?

http://en.wikipedia.org/wiki/Enlightenment_Foundation_Libraries#EET
http://docs.enlightenment.org/api/eet/html/

kazanaki
  • Nice idea, but not mature yet. It depends on Eina which is currently unstable. Also, it basically stores a hash map of "chunks", which requires an extra layer to manage them. – Ioan Mar 11 '10 at 19:07
  • The Ogg container format is also very similar to Eet in simplicity: http://www.xiph.org/ogg/doc/ – Ioan Mar 12 '10 at 14:42
0

What about BerkeleyDB? It's not exactly a filesystem, but it stores binary data in a file quite transparently. The license seems to be quite permissive as well.

lorenzog
  • According to http://www.oracle.com/technology/software/products/berkeley-db/htdocs/licensing.html it requires a commercial license for closed-source applications. Also, I don't know how much faster it is than SQLite, which was a few times slower than simply storing directly to a file. This was a simple test, batch commit to store data. – Ioan Mar 22 '10 at 12:24
  • Well, you did not mention you were doing a closed-source application. And yes, it might be just like SQLite, but with a few more years of stability on the embedded side. – lorenzog Mar 22 '10 at 15:33
  • Sorry, my comment was misleading. The current BerkeleyDB open-source license is GPL-like (as in, it requires your entire application source be released), violating one of the requirements I mentioned in the question. – Ioan Mar 29 '10 at 18:23