0

EDIT: This post was marked as "duplicate," which I think missed my original point. So I have clarified my question, and I hope it will be re-opened.

I recently came across a situation in which I needed random access to a collection of Python objects. The collection did not fit in memory, so I consulted the Python documentation on object persistence.

The documentation mentions two options for non-serial persistence: shelve and sqlite3.

  • shelve is a Python-specific library that works with some kind of Unix built-in. (I'm not familiar with the specifics.)
  • sqlite3 is a widely supported, highly dependable mini-database that is used all over the place.

So my first reaction was to use sqlite3, as it's highly flexible and has a rich ecosystem. But according to another SO discussion, storing pickled Python objects in sqlite3 is considered very bad form.

In the question marked above, it is noted that pickling data into sqlite3 prevents queries within that data. However, there's nothing stopping me from adding two, three, or many columns by means of which I could retrieve cross-sections of the data. shelve does not enable this.

To me, this presents a whole range of use cases that shelve simply does not offer. Nevertheless, in the post noted above, it is claimed that "doing this sort of thing is indicative of bad design," as well as a host of other judgmental statements.

My question is: why is this? Why is doing this a sign of unhealth? I can think of numerous use cases where I might want the flexibility and capacity of sqlite3 without the overhead of a full-fledged database system.

David Bruce Borenstein
  • 1,655
  • 2
  • 19
  • 34
  • The other answer **already explains this**; you can't use SQL's features to query data based on what's *inside* the pickle data. It is otherwise just an opinion. – Martijn Pieters Apr 10 '19 at 18:27
  • (`shelve` stores pickles, in a [DBM file](https://en.wikipedia.org/wiki/DBM_(computing)), but DBM doesn't support any kind of querying anyway). – Martijn Pieters Apr 10 '19 at 18:28
  • 1
    Using `sqlite3` for this doesn't really offer much advantage over `shelve` (which does the same thing you're proposing, just with a different database engine, `dbm`) and `dbm` is simpler for this use case. – kindall Apr 10 '19 at 18:30
  • Martijn, thanks, but that wasn't really the question. As you say, shelve also doesn't permit querying, but I sqlite3 gives me greater flexibility in the kinds of retrieval strategies I can use. I am going to revise the question, and I hope you will re-open it as I do not think that it's a duplicate. – David Bruce Borenstein Apr 10 '19 at 21:56
  • 1
    @DavidBruceBorenstein the question is based on a misunderstanding. The duplicate *is* a duplicate of what you asked. Your comments show that you want something different though. sqlite3 is a relational database, not a file format. You use it because you want an embedded relational database with indexing and transactions instead of eg MySQL or PostgreSQL. It's still a relational database though. If you want to store objects, you map them to tables with an ORM so you can query the data. – Panagiotis Kanavos Apr 11 '19 at 07:45
  • @DavidBruceBorenstein there are several questions about Python ORMs and SQLite. For example [Python lightweight database wrapper for SQLite](https://stackoverflow.com/questions/9026069/python-lightweight-database-wrapper-for-sqlite) mentions peewee and SQLAlchemy – Panagiotis Kanavos Apr 11 '19 at 07:49
  • 1
    @DavidBruceBorenstein: The author of SQLite states in the documentation that SQLite was created for just this purpose, so it's perfectly reasonable to use use it as you have envisioned. However, many people now consider that SQLite is an RDBMS *only*, and their opinion is that, as such, it shouldn't be used for purposes like this. So even your re-formulated question is going to elicit opinion more than answers, so it would still be off-topic. – Mark Benningfield Apr 11 '19 at 23:24

0 Answers0