5

So, my application depends on a huge number of small files. The actual number is somewhere around 90,000. Now, I use a component that needs access to these files, but the only way it accepts them is through a URI.

So far I have simply added a directory containing all the files to my debug folder while developing the application. However, now I have to consider deployment. What are my options for including all these files with my deployment?

So far I have come up with a couple of different solutions, none of which I've managed to make work completely. First was to simply add all the files to the installer which would then copy them to their places. This would, in theory at least, work, but it'd make maintaining the installer (a standard MSI-installer generated with VS) an absolute hell.

The next option I came up with was to zip them into a single file and add this as a part of the installer and then unzip them by the use of a custom action. The standard libraries, however, do not seem to support complex zip-files, making this a rather hard option.

Finally, I realized that I could create a separate project and add all the files as resources in that project. What I don't know is how URIs pointing to resources stored in other assemblies work. Meaning, is it "standard" for everything to support the "pack://application:,,,/Assembly;component/..." format?

So, are these the only options I have, or are there some other ones as well? And what would be the best option to go about this?

bobblez

4 Answers

4

I would use a single zip-like archive file, and not unzip that file on your hard disk but leave it as is. This is also the approach used by several well known applications that depend on lots of smaller files.

Windows Explorer shows zip files as virtual folders (as of XP), so users can see and edit their contents with standard tools.

C# also has excellent support for zip files. If you're not happy with the built-in tools, I recommend one of the major zip libraries out there; they're very easy to use.

In case you worry about performance, caching files in memory is a simple exercise. If your use case actually requires the files to exist on disk, that is not an issue either: just unzip them on first use. It's only a few lines of code.
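For the on-disk case, a minimal sketch of extracting on first use might look like the following. It assumes the `System.IO.Compression` `ZipFile` and `ZipArchive` classes (.NET Framework 4.5 or later, with a reference to `System.IO.Compression.FileSystem`); on older frameworks a library such as SharpZipLib or DotNetZip would take their place. `LazyZipFiles`, `GetFileUri`, and the cache folder are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of any library.
public static class LazyZipFiles
{
    // Returns a file:// URI for one archive entry, extracting it into
    // cacheDir the first time it is requested. entryName uses the name
    // stored in the archive, normally with forward slashes ("folder/a.txt").
    public static Uri GetFileUri(string zipPath, string entryName, string cacheDir)
    {
        string cacheRoot = Path.GetFullPath(cacheDir).TrimEnd(Path.DirectorySeparatorChar);
        string target = Path.GetFullPath(Path.Combine(cacheRoot, entryName));

        // Reject entry names such as "../x" that would land outside the cache.
        if (!target.StartsWith(cacheRoot + Path.DirectorySeparatorChar, StringComparison.Ordinal))
            throw new ArgumentException("Entry name escapes the cache directory.", "entryName");

        if (!File.Exists(target))
        {
            using (ZipArchive archive = ZipFile.OpenRead(zipPath))
            {
                ZipArchiveEntry entry = archive.GetEntry(entryName);
                if (entry == null)
                    throw new FileNotFoundException("Entry not found in archive.", entryName);

                Directory.CreateDirectory(Path.GetDirectoryName(target));
                entry.ExtractToFile(target, false); // do not overwrite existing files
            }
        }

        // An absolute local path converts to a file:// URI.
        return new Uri(target);
    }
}
```

Reopening the archive on every cache miss rereads its central directory, which may be noticeable with tens of thousands of entries; keeping one `ZipArchive` open for the lifetime of the application is a possible refinement.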

In short, just use a zip archive and a good library and you won't run into any trouble.

In any case, I would not embed this huge number of files in your application directly. Data files should be kept separate.

mafu
  • This sounds like a good solution. However, I was unable to find any decent resources on how to go about using zip-files that contain anything more than a singular file. Could you perhaps point me towards one? – bobblez Apr 18 '11 at 13:54
  • Also, is it possible to create URIs pointing to files that are in a zip-file so that the OS/framework/magic takes care of the actual mapping to the file? – bobblez Apr 18 '11 at 14:03
  • I'm not sure I got your question correctly - are you asking which library to use? Be pointed towards http://stackoverflow.com/questions/374396/open-source-zip-library-for-net as well as http://stackoverflow.com/questions/940582/how-do-i-zip-a-file-in-c-using-no-3rd-party-apis, possibly also http://stackoverflow.com/questions/449998/free-compression-library-for-c-which-supports-7zip-lzma. Especially SharpZipLib is generally considered a great tool. – mafu Apr 18 '11 at 14:08
  • Yes, it is possible. Try it yourself: Create a zip file and add a directory and a file in Explorer. The path is going to look like `C:\somewhere\archive.zip\folder\file.txt`. This is a virtual folder, and most correctly programmed applications should be able to cope with it. You can even open the file by typing the path into the shell as is. – mafu Apr 18 '11 at 14:11
  • ps: DotNetZip has a command-line tool that you could use in your build script to produce the zip file containing your 90,000 files. But keep in mind that 90,000 is over the limit of 65,535 entries in a standard zip file, so you will need to use the zip64 format for that number of files. Check the DotNetZip documentation for more information. I don't know if the shell support for zip files in Windows supports zip64; you'd have to check. – Cheeso Apr 18 '11 at 17:53
  • Hmm, now that I finally got around to actually testing this, it does not seem to work for me. Could having WinRAR installed (and set as the default program for handling .zip-files) be causing this? I can't seem to make it work from command line nor from code (System.IO.... throws a TargetInvocationException). I tried it with both the zip-file containing all the files, and a smaller one containing only about 100 files, but neither worked :( Oh and I'm using w7 x86. – bobblez Apr 19 '11 at 05:49
  • @bobblez I've got w7 and WinRAR as default handler, too. The example I gave earlier works fine for me. What exactly is the problem? – mafu Apr 19 '11 at 09:17
  • Also, if 90k files in 1 zip file creates trouble, simply split it into 2 or more files, possibly grouping the files in some way (one file for images, one file for text, one file for the rest, or whatever seems logical in your case; you're free to choose). – mafu Apr 19 '11 at 09:19
  • Another common approach is to add all files at one point in time to the first archive. Later, when files get updates, a second (separate) archive is added that _adds_ new content as well as overwrites (_updates_) files from the first archive. Later on, a third, and so on, just go by (for instance) Data001.zip, Data002.zip, ... to be read by your software in order. This allows for very easy updates to your software. – mafu Apr 19 '11 at 09:22
  • I'm not at my coding environment right now, but I think the exception I got was TargetInvocationException, and it simply stated that the path I specified could not be found. I tried the same via the command line myself and got the same kind of error from Windows, i.e. cannot find the file specified, or something of the sort, with my very simple zip file that contained just one txt file. – bobblez Apr 19 '11 at 15:41
  • And yes, splitting the files into multiple zip-files is not a problem and I'll probably do that anyways. – bobblez Apr 19 '11 at 15:43
  • Regarding the exception, I've got no clue as to why this appears. Maybe create a question over at superuser.com and include a screenshot and reproduction details. It definitely works for me so I think you've got a glitch somewhere. – mafu Apr 20 '11 at 08:34
  • Oh well, seems like I'll have to extract the files after all. Asked it on superuser.com and the answer I got was that the virtualization of the zip-files as folders is a feature of explorer.exe, not the windows file system :/ – bobblez Apr 20 '11 at 13:33
  • I see, too bad. Well, can't you still use a zip library to access the files? – mafu Apr 20 '11 at 13:41
  • Well not for accessing them without extraction. Like I mentioned, I need to be able to provide URIs to the files. So the only option left is to extract the files as a part of the installation (or on first start, but I think I'll rather do it at the installation). – bobblez Apr 20 '11 at 13:51
  • Yeah. I guess you have to bite that apple :\ – mafu Apr 20 '11 at 17:46
0

You could include the files in a zip archive, and have the application itself unzip them on first launch as part of a final configuration, if it's not practical to do that from the installer. This isn't entirely atypical (e.g. it seems like most Microsoft apps do a post-install config on first run).
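A first-launch extraction step could be sketched roughly as below, again assuming the `System.IO.Compression` classes from .NET Framework 4.5 or later. `FirstRunSetup`, `EnsureExtracted`, and the `.extracted` marker file are hypothetical names; the marker is one possible way to tell a completed extraction from an interrupted one.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of any library.
public static class FirstRunSetup
{
    // Extracts the whole archive into dataDir once and returns dataDir.
    // dataDir must be a folder dedicated to this data: an incomplete
    // earlier extraction is deleted before retrying.
    public static string EnsureExtracted(string zipPath, string dataDir)
    {
        string marker = Path.Combine(dataDir, ".extracted");
        if (File.Exists(marker))
            return dataDir; // a previous run finished successfully

        if (Directory.Exists(dataDir))
            Directory.Delete(dataDir, true); // clear partial results

        Directory.CreateDirectory(dataDir);
        ZipFile.ExtractToDirectory(zipPath, dataDir);

        // Written last, so its presence implies a complete extraction.
        File.WriteAllText(marker, DateTime.UtcNow.ToString("o"));
        return dataDir;
    }
}
```

A caller would need a target folder the application can write to, for example a subfolder of `Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData)`, which is per user and not part of the roaming profile.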

Depending on how the resources are used, you could have a service that provides them on demand from a store of some kind and caches them, rather than dumping them somewhere. This may or may not make sense depending on what these resources are for, e.g. if they're UI elements, a delay on first access might not be acceptable.

You could even serve them using http from a local or non-local server, or a SQL server if it's already using one, caching them as well, which would be great for maintenance, but may not work for the environment.

I wouldn't do anything that involves an embedded resource for each file individually; that would be hell to maintain.

Jamie Treworgy
  • The problem with unzipping on first run is that the app may not have write access to its installation directory. – Greg Apr 18 '11 at 13:56
  • You could put them in the user application data folder. Not ideal, but based on the amount of garbage that other applications seem to dump there, not unusual either. – Jamie Treworgy Apr 18 '11 at 14:02
  • True. The OP would need to decide if it's OK to generate 90,000 files per user. I hope those files wouldn't be in the roaming part of a User's Profile. – Greg Apr 18 '11 at 14:11
  • Not ideal to be sure. I'm surprised there's no simple way to extract a bunch of files using an MSI installer though can't say I've ever tried. Extracting them in real time as-needed from a zip archive would almost certainly be perceptibly slow. One could do something complicated like use an ISO image of the file structure and read from that, which would be a lot faster than a zip archive, but seems like overkill. – Jamie Treworgy Apr 18 '11 at 14:16
0

Another option could be to create a self-extracting zip/rar archive and extract it from the installer.

Elalfer
0

One of the options is to keep them in compound storage and access them right in the storage. The article on our site describes various types of storages and their advantages / specifics.

Eugene Mayevski 'Callback