
A co-worker proposed a solution: set up a caching layer for files that never (or almost never) get updated. Currently the Ruby application has to fetch content from the DB, render a page, and serve it for each request, with the exception of images/CSS/JS, which are cached in the Akamai CDN.

The performance is terrible when there are 5K users on the site.

The proposed solution is to generate 15 million static pages once (about 4 TB), store them in a single directory on an NFS server, share that directory among 9 Apache/Phusion-Passenger servers, and configure Apache to serve the static content from the mounted NFS share.
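For context, this is roughly what the proposed pre-generation step would look like in Ruby (a minimal sketch; render_article is a hypothetical helper returning rendered HTML, and /mnt/static_pages is an assumed NFS mount point, not part of the actual proposal):

    NFS_ROOT = '/mnt/static_pages'  # assumed NFS mount point

    # Proposed layout: every rendered page written into one flat directory.
    def write_static_page(article_id, html)
      File.write(File.join(NFS_ROOT, "article-#{article_id}.html"), html)
    end

    # e.g. write_static_page(123, render_article(123))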

In addition to the initial 15 million files, about 8K static files will be generated per day and added to the NFS share.

I don't believe this is a good solution and don't feel comfortable implementing it; instead I'm looking into Varnish to cache the most frequently accessed articles. Still, I'd like to know what others think about the proposed solution vs. Varnish.

Questions:

  • Can 15 million files (4 TB) be stored in a single directory in Linux (CentOS)?
  • Can such a large directory be shared via NFS? Will that be stable?
  • Can 15 million files be stored in hashed directories instead, or is that still a bad idea? (A sketch of what I mean follows this list.)
  • Is there a maximum file limit for an NFS share?
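To illustrate the hashed-directories idea, here is a minimal Ruby sketch; the mount point /mnt/static_pages and the article-<id>.html naming are assumptions for the example only:

    require 'digest'

    NFS_ROOT = '/mnt/static_pages'  # assumed mount point

    # Spread files over two levels of subdirectories derived from an MD5
    # hash of the article id, e.g. /mnt/static_pages/ab/cd/article-123.html
    def hashed_path(article_id)
      digest = Digest::MD5.hexdigest(article_id.to_s)
      File.join(NFS_ROOT, digest[0, 2], digest[2, 2], "article-#{article_id}.html")
    end

With two hex characters per level that gives 256 * 256 = 65,536 directories, so 15 million files works out to roughly 230 files per directory instead of one enormous directory.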

I'd like to thank you in advance for your advice.

Nerses
  • Possible duplicate of http://stackoverflow.com/questions/466521/how-many-files-in-a-directory-is-too-many – AbsoluteƵERØ Apr 19 '13 at 00:52
  • 1
    Just use varnish. It's much simpler and more powerful than writing your own caching layer. And stay away from NFS unless you have *no other choice*. – Dave S. Apr 19 '13 at 01:27

1 Answer


You can give GlusterFS a try.

First, partition your articles by category. Then store them in a GlusterFS directory structure like: /mnt/articles/category1/201304/20130424/{a lot of files}
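A minimal Ruby sketch of building such a path (the mount point and the article-<id>.html naming are assumptions for the example):

    require 'date'

    GLUSTER_ROOT = '/mnt/articles'  # assumed GlusterFS mount point

    # Builds a path like /mnt/articles/category1/201304/20130424/article-123.html
    def partitioned_path(category, published_on, article_id)
      File.join(GLUSTER_ROOT,
                category,
                published_on.strftime('%Y%m'),
                published_on.strftime('%Y%m%d'),
                "article-#{article_id}.html")
    end

    # partitioned_path('category1', Date.new(2013, 4, 24), 123)
    # => "/mnt/articles/category1/201304/20130424/article-123.html"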

I have a 6-node GlusterFS cluster that stores log files. It currently holds 8 TB+ of files and grows by 30 GB+ every day without any problem.

Kevin Leo