7

I would like to store some Application-Related Metadata for Files, and NTFS Alternate Data Streams (AltDS) would allow me to store this metadata directly on the files rather than in a separate database.

I just don't feel like this is a good idea. I know that this only works on NTFS, but at least if the user copies/moves the files to a Non-NTFS drive they get a Warning from Windows (yeah, yeah, no one reads warnings, I know).

But also, storing additional data on a file can become very wasteful, as the AltDS stay even if my Application is uninstalled. It's like a decade ago when people used "Registry Cleaners" to remove useless entries from the registry after uninstalling a program to make their system run faster (and less stable when the cleaner cleaned too much...).

I just wonder what they can reasonably be used for. Should they be left completely for Microsoft Apps to use? Or is there some sort of common policy on what types of apps may use them (apart from malware)?

Edit: Just to clarify what my idea was. I'm in the early stages of writing a small document management system for myself. Because I want to have the freedom to move files around, I want to store metadata on the file so that if I move/rename/modify them, my app still recognizes them. It could either be the entire Metadata or just a GUID that works with a separate database.
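
A minimal sketch of the "just a GUID" variant, assuming Windows and an NTFS volume. The stream name `docmgr.guid` and both function names are hypothetical and not from the original post. It relies on the fact that Windows accepts `file:stream` paths in ordinary file-open calls, so Python's built-in `open()` can read and write a named stream.

```python
import os
import uuid

STREAM_NAME = "docmgr.guid"  # hypothetical stream name, chosen for illustration


def stream_path(file_path: str, stream: str = STREAM_NAME) -> str:
    """Build the 'file:stream' path that addresses a named stream."""
    return f"{file_path}:{stream}"


def tag_file(file_path: str) -> str:
    """Return the GUID attached to the file, creating one if it is missing.

    Only meaningful on Windows with the file on an NTFS volume.
    """
    if os.name != "nt":
        raise OSError("alternate data streams need Windows and NTFS")
    target = stream_path(file_path)
    try:
        # The stream already exists: reuse the stored GUID.
        with open(target, "r", encoding="ascii") as fh:
            return fh.read().strip()
    except FileNotFoundError:
        # No stream yet: generate a GUID and attach it to the file.
        guid = str(uuid.uuid4())
        with open(target, "w", encoding="ascii") as fh:
            fh.write(guid)
        return guid
```

The GUID would then be the key into the separate database, so the file's main content stays byte-for-byte unchanged.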

To summarize the points given:

Pros:

  • Metadata moves with the file, so no need to recognize it through hashing or filename
  • Works with all FileTypes, even .txt files where it's impossible to store any data in the file itself

Cons:

  • Only works on NTFS which may not be the default file system in future Windows Versions
    • Although it would surprise me if MS doesn't automatically convert them if they ever get WinFS together
  • AltDS remain even if my App is uninstalled
  • Privacy concerns
  • Fragile
    • Most USB Sticks are FAT32. Many private file servers are Linux. Downloading a file from the internet should only transfer the file but not the streams. In short: It's rather easy to lose them.
phuclv
Michael Stum

8 Answers

6

Another sticking point: Backup software. Some programs ignore alternate streams, some back them up but don't restore them, and some support them but don't do anything unless you tell them to.

Goyuix
4

It's hard to say without more information about the kind of data you're storing. You seem to be aware of some of the concerns involving their use, so I'm not sure how much I can help. Here are my general thoughts on alternate data streams, though:

First of all, as you've noted, AD streams only work on NTFS. If there's any chance you'll need to store this metadata on a FAT filesystem, you'll need some kind of fallback mechanism. Modern PCs will probably have NTFS-formatted internal hard drives, but most USB flash drives you encounter are still FAT-formatted. Keep that in mind if your users will be storing data files on flash drives.
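
One possible shape for such a fallback, as a sketch only: try the stream first and drop back to a sidecar file next to the document when the stream cannot be written. The stream name, the sidecar suffix and the function names are all hypothetical. The sketch assumes that opening a `file:stream` path on a volume without stream support fails with an `OSError`.

```python
import json
import os
from typing import Optional

STREAM = "docmgr.meta"           # hypothetical stream name
SIDECAR_SUFFIX = ".docmgr.json"  # hypothetical sidecar file suffix


def save_metadata(file_path: str, meta: dict) -> str:
    """Store meta in a named stream if possible, else in a sidecar file.

    Returns the path that was actually written.
    """
    payload = json.dumps(meta)
    if os.name == "nt":
        target = f"{file_path}:{STREAM}"
        try:
            with open(target, "w", encoding="utf-8") as fh:
                fh.write(payload)
            return target
        except OSError:
            pass  # assumed: the volume (e.g. FAT32) rejects stream names
    target = file_path + SIDECAR_SUFFIX
    with open(target, "w", encoding="utf-8") as fh:
        fh.write(payload)
    return target


def load_metadata(file_path: str) -> Optional[dict]:
    """Read metadata from the stream or the sidecar, whichever exists."""
    candidates = []
    if os.name == "nt":
        candidates.append(f"{file_path}:{STREAM}")
    candidates.append(file_path + SIDECAR_SUFFIX)
    for candidate in candidates:
        try:
            with open(candidate, "r", encoding="utf-8") as fh:
                return json.load(fh)
        except OSError:
            continue  # this location does not exist, try the next one
    return None
```

The sidecar has the opposite weakness of the stream: it survives FAT, but it does not follow the file when the user moves or renames it outside the application.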

Aside from that, I can't think of any technological reasons to avoid AD streams, but I'd still be wary of using them. People tend to be nervous about applications that "hide" data from them, regardless of the intent. Consider the Sony rootkit fiasco, and so on. I'm not saying your application is anywhere near as bad as that, but people (especially the less tech-savvy) may not make out the distinction. Still, I will allow that they might have a valid use for your application. The problem of leaving the AD streams behind after uninstallation is still very real, of course. You might want to consider giving people running the uninstaller the option of running a program to search their drive(s) and clean up any remaining streams.
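
A rough sketch of what such a cleanup pass could look like, assuming the application always uses one known stream name (`docmgr.meta` here is hypothetical, as is the function name). It relies on Windows deleting only the named stream, not the file, when a `file:stream` path is passed to the delete call behind `os.remove`. Test it on scratch files before trusting it.

```python
import os

STREAM = "docmgr.meta"  # hypothetical stream name used by the application


def remove_streams(root: str, stream: str = STREAM) -> int:
    """Delete the named stream from every file under root.

    Returns the number of streams removed. The files themselves are kept.
    """
    if os.name != "nt":
        # Elsewhere ':' is an ordinary filename character, so do nothing.
        return 0
    removed = 0
    for folder, _dirs, files in os.walk(root):
        for name in files:
            target = f"{os.path.join(folder, name)}:{stream}"
            try:
                os.remove(target)  # removes the stream only
                removed += 1
            except OSError:
                pass  # no such stream, unsupported volume, or access denied
    return removed
```

An uninstaller could offer this as an opt-in step, for example `remove_streams("D:\\Documents")`, and report the returned count to the user.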

Also, remember the KISS principle. Is the use of AD streams really the simplest way to effectively solve your application's metadata storage problem? If so, maybe AD streams are a good idea, but, if not, I'd seriously consider taking another approach.

bcat
  • Thanks. The data is Metadata for a document management system. It's mainly something for myself and in early planning, but I came across AltDS and thought I'd gather some opinions. The "hiding data" part is actually a good point. Only very few users are actually really trying to protect themselves from Malware (Pretty much everyone is willing to install crap if it has a cute mascot), but most people are quick to chime in once someone calls out an App as being Malware, even if it's unfounded. – Michael Stum Dec 30 '09 at 04:16
3

I can think of one good reason not to use them, and that's this little tidbit from their "how to use" guide:

Alternate data streams are strictly a feature of the NTFS file system and may not be supported in future file systems. However, NTFS will be supported in future versions of Windows NT.

Now... the way this is worded, I guess, technically you're safe. But if Microsoft ever decides to supersede/deprecate NTFS - and they did come pretty close at one point - then you're going to have to scramble to upgrade your software so it runs on newer machines.

As unlikely as that possibility may seem now, I think it's less unlikely than suddenly finding yourself unable to wire up a SQLCE database or XML file stored in the user's AppData.

Having said that, I'm sure that there are some scenarios that justify the use of ADS. In my opinion it's just one of those cases where, if you aren't absolutely sure that it's the right tool, then it's probably the wrong one.

Attaching metadata to files in general is a dangerous game. Just look at the unholy mess that is ID3 and the embarrassing results of people leaving the EXIF data in images.

P.S. Registry cleaners aren't used anymore? Why didn't anybody tell me!?

Aaronaught
  • 1
    Good point. Yes, WinFS was supposed to supersede NTFS, but as I understood it, WinFS still supports Metadata (as it's essentially an OS-embedded SQL Server). As for ID3 and EXIF: They are stored in the file itself, but it's a good point. I haven't checked what happens if I download a file that is served from an NTFS Server through IIS to an NTFS File system - I would hope that it kills the AltDS, but haven't checked. My usage would be similar to EXIF, it's for a document management system where I want to have the Data (just a GUID actually) move when the file is moved. – Michael Stum Dec 30 '09 at 04:19
2

Alternate data streams are essential to NTFS and will always be supported. When the file they are attached to gets deleted, they get deleted as well, so no worries about them "sticking around".

As all the others have said, there are issues with backup, copying to other filesystems, and paranoia regarding ADS.

Dominik Weber
1

If your app can function without that data, for example recreating it as necessary, the data streams are perfectly acceptable.

Given how they are used in Windows, I don't think they are going away anytime soon.

NotMe
1

Bad idea for you, bad idea for MS. I think they were really an attempt to compete with the Mac's data and resource fork file architecture back in the day. If the Mac FS files can have 2 forks, then ours will have unlimited "forks", and maybe we'll eventually figure out how to use them.

President James K. Polk
  • I was thinking about those. As I understand it, Mac OS uses the second fork for metadata like the Application and Type descriptors for files (as it doesn't rely on file extensions), which allows us to say "I want to open this .txt file with that Editor, but this other .txt file with another editor". That works because it's neatly integrated with Finder, whereas Windows Explorer only seems to use AltDS for the annoying "Can't execute this file as it's from the Internet" message. – Michael Stum Dec 30 '09 at 04:12
  • @MichaelStum: The second fork on an "old Mac OS" file holds a non-hierarchical collection of resources, each of which has a 32-bit type (generally represented by four printable ASCII characters), a 16-bit ID, an optional name, and an arbitrary quantity of data (I think up to two gigs). Application files often have everything in the resource fork and nothing in the data fork, though there are exceptions. Some text editors store raw text in the data fork, and things like tab stops in the resource fork. Some applications store everything in their documents' resource fork. – supercat Mar 01 '12 at 22:15
  • 1
    Actually, ADS were added to NTFS specifically to be compatible with MacOS forks, not to "compete" with them. There were tools to translate resource forks into consistently-named ADS and vice-versa in order to avoid data loss when you moved files across operating systems. – Ti Strga Apr 25 '14 at 20:24
1

Adding an AltDS to a file as a way to tie an application-specific string around it has the problem you cite: no cleanup. And if the file gets copied, your stuff follows it around. For this case, keeping a separate database is probably more virtuous.

If the file, on the other hand, is very much under your own control, then if AltDS is an efficient way to do the job, go ahead.
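
For comparison, a minimal sketch of the separate-database route, recognizing files by a content hash instead of anything attached to them. The table layout and function names are made up for illustration. It uses only the standard `hashlib` and `sqlite3` modules, so it is not tied to NTFS.

```python
import hashlib
import sqlite3
from typing import Optional


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of the file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_db(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the metadata database and create the table if needed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (sha256 TEXT PRIMARY KEY, note TEXT)"
    )
    return conn


def remember(conn: sqlite3.Connection, path: str, note: str) -> None:
    """Store a note for the file, keyed by its content hash."""
    conn.execute(
        "INSERT OR REPLACE INTO docs VALUES (?, ?)", (file_digest(path), note)
    )
    conn.commit()


def lookup(conn: sqlite3.Connection, path: str) -> Optional[str]:
    """Return the note stored for this content, or None if unknown."""
    row = conn.execute(
        "SELECT note FROM docs WHERE sha256 = ?", (file_digest(path),)
    ).fetchone()
    return row[0] if row else None
```

The trade-off is visible in the key: a renamed or moved file is still recognized, but any edit to the content changes the hash and the link is lost, which is exactly what a GUID carried in a stream would avoid.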

bmargulies
0

One thing I did NOT hear so far is using AltDS in applications where certain kinds of information MUST be hidden (e.g. Medical applications), while it is desired to NOT hide other kinds of information.

The reason I LOVE AltDS is exactly that: I can design a Medical Imaging system, where I keep medical images in the open (as BMP, e.g.) w/o any patient information details, because those I can keep in an AltDS. Bingo. Advantage: If somebody copies the files to a thumb-drive - well, all that person gets is the BMP w/o the patient info.

Backup/Restore is always nasty - my solution was to move to a proprietary file-format on the backup, where the patient info is encoded/encrypted in the same file as the (raw) BMP.

Lastly, if you store the hidden information in XML format, your application may be gone but the information is still there. The information should be linked to the file itself, not the application. That should probably be stored somewhere else.

Overall I L-O-V-E AltDS. The lack of OS support (can't see the AltDS data), lack of general/public knowledge (who? what? Ads? What kinda advertisements) and the fact that I don't have to worry about that additional information to keep in sync with the main file (ahem Stream) allows me to design and implement really robust systems. The Backup is a bummer - especially Joliet should have been designed to handle those AltDS - but I can live with it.

Just my 2c (well, maybe 3c...).

  • 2
    If the Thumb Drive is NTFS Formatted the AltDS goes with it, but all the information in that has to be encrypted anyway. But your scenario is exactly the one I had in mind: Attach Metadata while still keeping the file itself unchanged/intact. – Michael Stum Dec 20 '10 at 22:23
  • 7
    The way this was written makes me very concerned as a security engineer. Alternate Data Streams are not a security feature, and just because you can't see them in windows explorer doesn't mean that they are a security barrier or they can't be viewed. If you encrypt the metadata that is fine, but relying on the fact that they aren't available via explorer or sometimes aren't copied along with the file are very dangerous security mitigations. – Peter Oehlert Dec 29 '10 at 22:22
  • 1
    aka "security through obscurity". I shudder at the idea of medical Privacy Act data being easily gotten by anybody who knows how to type ":space_dude_alien's_secret_stuff" at the end of my X-ray bitmaps... – Ti Strga Apr 25 '14 at 20:27