
I'm planning to store gigantic arrays in serialized files and read them back in order to display the data they contain. The idea is to have a simple, document-oriented, filesystem database. Can anyone tell me whether this would be a performance issue? Is it going to be slow or very fast?

Is it worth it? Is the filesystem always really faster?

NikiC
Jamie
  • More information is needed. The filesystem is not always (or even most of the time) faster than using an SQL or NoSQL db. It all depends on your use case. So if you can figure out what your exact usage requirements are, you can figure out which is right... – ircmaxell Oct 27 '10 at 16:39
  • Define "gigantic"! Thousands of entries, tens of thousands, millions, bigger? Multidimensional arrays? Serialization/deserialization is slow when working with large arrays, so being able to read a section of an array is probably better. – Mark Baker Oct 27 '10 at 16:41
  • Possible duplicate of [What are some good, fast persistant storage options for key->value data?](http://stackoverflow.com/questions/3972362/what-are-some-good-fast-persistant-storage-options-for-key-value-data) – Gordon Oct 27 '10 at 16:43
  • I'd be more inclined to do a `var_export` (less overhead in loading than `unserialize`). – Wrikken Oct 27 '10 at 16:48
  • What does "gigantic" mean to you? 1 MB, 1 GB, 1 TB? How many files are you going to have? Thousands? Millions? All the folks telling you to use a real database or a key-value store are probably right, but the details matter when you get to extremes. – xscott Oct 27 '10 at 16:50
  • I'm thinking about 5,000 entries, max. – Jamie Oct 27 '10 at 19:37
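Wrikken's `var_export` idea above can be sketched as follows (the file name `data.php` is just an example): writing the array out as executable PHP lets you load it back with `include`, so no `unserialize()` pass is needed, and an opcode cache can cache the parsed result.

```php
<?php
// Dump the array as a PHP file that returns it (hypothetical file name).
$data = ['id' => 7, 'name' => 'example'];
file_put_contents('data.php', '<?php return ' . var_export($data, true) . ';');

// Later: load it back with include — no unserialize() call involved.
$loaded = include 'data.php';

unlink('data.php');
```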

4 Answers


It will be very slow. Serializing and unserializing always require reading and processing the whole array, even if you only need a small part.

Thus you are better off using a database (like MySQL). Or, if you only need key/value access, use APC or memcached.
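A minimal sketch of the cost being described (the file name is hypothetical): even to read a single element, `unserialize()` has to parse the entire file back into memory first.

```php
<?php
// Serialize a large-ish array to disk (hypothetical file name).
$big = range(1, 5000);
file_put_contents('big.ser', serialize($big));

// To read just ONE element, the whole array must be loaded and parsed:
$all = unserialize(file_get_contents('big.ser'));
$one = $all[4999]; // 5000 — but all 5,000 entries were processed to get here

unlink('big.ser');
```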

NikiC
  • But that's not how a filesystem database works... It uses directories and files to organize the data instead of an array or other complex data structure. So there's no serialization/deserialization needed (well, above what you would need for MySQL/etc anyway)... I'm not arguing that it's right to implement, but for some use cases (LARGE amounts of sparse data, needing access from shell scripts, etc) it **may** make sense (now, these are few and far between). But it's by no means "very slow"... – ircmaxell Oct 27 '10 at 16:47
  • @ircmaxell: Well the OP talks about serializing large arrays to the filesystem. That certainly doesn't make sense. He doesn't say anything about directories and files. – NikiC Oct 27 '10 at 16:50
  • whoops. I must have skipped right over that... I still think the comment is valid (aside from that one line) otherwise. But I agree, there's no reason to store complex data structures in a serialized file yourself unless you're lazy... – ircmaxell Oct 27 '10 at 16:53

You'll be much better off using a "proper" database - it's what they're designed for. If your data is really in a document-oriented format, consider CouchDB.

Craig A Rodway
  • Unfortunately, I don't want to install anything. I need something that runs with just PHP. Simple and fast. – Jamie Oct 27 '10 at 16:48
  • The PHP-only solution may be simple. But fast entirely depends on your dataset, operating system, filesystem, and disk architecture. – Craig A Rodway Oct 27 '10 at 16:50

I think you could implement this without many performance issues, just so long as your arrays don't take forever to (un)serialize and you are able to look up your files efficiently. How do you plan on looking up which file to read, btw?

> Is it worth it? Is the filesystem always really faster?

No, this method is not always faster; in fact, you'd probably get better performance using some sort of database or cache for what you're trying to do.
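One way the file lookup could work (a sketch with hypothetical helper names, using one serialized file per record): the key maps directly to a file path, so only the requested record is read and unserialized, never the whole dataset.

```php
<?php
// Hypothetical helpers: one serialized file per record, keyed by ID.
function save_record(string $dir, string $id, array $record): void {
    file_put_contents("$dir/$id.ser", serialize($record));
}

function load_record(string $dir, string $id): ?array {
    $path = "$dir/$id.ser";                       // key maps straight to a path
    if (!is_file($path)) {
        return null;
    }
    return unserialize(file_get_contents($path)); // parses one record only
}

$dir = sys_get_temp_dir();
save_record($dir, 'doc42', ['title' => 'example']);
$record = load_record($dir, 'doc42');

unlink("$dir/doc42.ser");
```

Note this sketch does no sanitizing of `$id`; in real use you'd need to guard against path traversal and concurrent writes.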

Parris Varney

Quite big: a multidimensional array with roughly 5,000 entries.

Jamie