9

I need to store about 600,000 images on a web server that uses NTFS. Am I better off storing images in 20,000-image chunks in subfolders? (Windows Server 2008)

I'm concerned about incurring operating system overhead during image retrieval.
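
To make the layout concrete, this is roughly what I have in mind (a sketch only; the root path and naming scheme below are placeholders):

```python
import os

IMAGES_ROOT = r"D:\images"   # placeholder root on the web server
CHUNK_SIZE = 20_000          # images per subfolder (~30 folders for 600,000 images)

def image_path(image_id: int) -> str:
    """Build the full path for an image without ever listing a directory."""
    bucket = image_id // CHUNK_SIZE            # e.g. image 123456 -> folder "0006"
    return os.path.join(IMAGES_ROOT, f"{bucket:04d}", f"{image_id}.jpg")

def save_image(image_id: int, data: bytes) -> str:
    path = image_path(image_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)   # create the bucket folder on demand
    with open(path, "wb") as f:
        f.write(data)
    return path
```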

Brian Webster
  • 20,000 files is still a lot; better to split on the order of 1,000 files/dir. – Javier Apr 12 '10 at 21:32
  • I suppose I'd be tempted to ask whether 600 directories all within one directory becomes an issue. I imagine operating systems are optimized for fewer than 100, but I don't think 600 would be an issue. – Brian Webster Apr 12 '10 at 21:33
  • It used to be the case that putting that many files in a directory in NTFS would do damage to the FS itself, causing growth of some table somewhere that would never shrink. Did that get fixed? (Sorry for being vague here; I forget the details.) – Donal Fellows Apr 12 '10 at 21:35
  • It should be OK as long as the white ones have an extension of .6. – xpda Apr 12 '10 at 23:54
  • Does this answer your question? [NTFS performance and large volumes of files and directories](https://stackoverflow.com/questions/197162/ntfs-performance-and-large-volumes-of-files-and-directories) – Frédéric May 14 '20 at 10:28

3 Answers

8

Go for it. As long as you have an external index and a direct file path to each file, so you never have to list the contents of the directory, you are OK.

I have a folder that is over 500 GB in size with over 4 million folders (which contain more folders and files). I have somewhere on the order of 10 million files in total.

If I accidentally open this folder in Windows Explorer, it gets stuck at 100% CPU usage (for one core) until I kill the process. But as long as you refer to the file/folder directly, performance is great (meaning I can access any of those 10 million files with no overhead).
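
For illustration, here is a minimal sketch of what I mean by an external index. A SQLite table mapping image keys to relative paths is just one way to do it; the names and paths below are hypothetical:

```python
import os
import sqlite3

IMAGES_ROOT = r"D:\images"      # hypothetical image root
INDEX_DB = "image_index.db"     # hypothetical external index

conn = sqlite3.connect(INDEX_DB)
conn.execute(
    "CREATE TABLE IF NOT EXISTS images (key TEXT PRIMARY KEY, relpath TEXT NOT NULL)"
)

def register_image(key: str, relpath: str) -> None:
    """Record where an image lives so it can later be opened without any directory scan."""
    conn.execute("INSERT OR REPLACE INTO images VALUES (?, ?)", (key, relpath))
    conn.commit()

def open_image(key: str):
    """Look the key up in the index and open the file directly by its full path."""
    row = conn.execute("SELECT relpath FROM images WHERE key = ?", (key,)).fetchone()
    if row is None:
        raise KeyError(key)
    return open(os.path.join(IMAGES_ROOT, row[0]), "rb")
```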

Pyrolistical
4

NTFS does index directory entries (directories are stored as B-trees of file names), so from the application level this should be all right.

That is, opening files by name, deleting, renaming, etc. programmatically should work nicely.

But the problem is always the tools. Third-party tools (such as MS Explorer, your backup tool, etc.) are likely to struggle, or at least be nearly unusable, with large numbers of files per directory.

Anything that does a directory scan is likely to be quite slow; worse, some of these tools have poor algorithms that don't scale to even modest (10k+) numbers of files per directory.
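
To make the distinction concrete, here is a rough sketch (the paths and file name below are hypothetical) comparing a direct open against a full directory enumeration:

```python
import os
import time

BIG_DIR = r"D:\images\0006"   # hypothetical folder holding many files
ONE_FILE = "123456.jpg"       # hypothetical file known to exist in it

# Direct access: the file system resolves the single name; cost stays roughly constant.
t0 = time.perf_counter()
with open(os.path.join(BIG_DIR, ONE_FILE), "rb") as f:
    f.read(1)
print("direct open:", time.perf_counter() - t0, "seconds")

# Directory scan: every entry is enumerated, which is what Explorer-style and backup
# tools effectively do, and what degrades as the per-directory file count grows.
t0 = time.perf_counter()
count = sum(1 for _ in os.scandir(BIG_DIR))
print("scanned", count, "entries in", time.perf_counter() - t0, "seconds")
```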

MarkR
1

NTFS folders store an index file with links to all of their contents. With a large number of images, that index is going to grow considerably and hurt your performance. So yes, on that argument alone you are better off storing the images in chunks across subfolders. Fragments inside indexes are a pain.

Shyam
  • While you might be right, is this based on anything but assumption? We have folders with 450k files, and there are no problems so far - though I guess browsing them, even in a file manager, wouldn't be fast. – leeeroy Apr 12 '10 at 21:40
  • Actually, it is not assumption (experience with various implementations of NTFS and, mostly, its limitations). It's an error by design. My answer was to clarify whether it would affect performance (yes), not whether it would cause 'problems' such as failure. The index file is not a smart file. If it grows, you get more maintenance getting the fragments back. And I quote myself: fragments inside indexes are a pain. – Shyam Apr 12 '10 at 21:47