31

We're designing an Android app that has a lot of data ("customers", "products", "orders"...), and we don't want to query SQLite every time we need some record. We want to avoid to query the database as most as we can, so we decided to keep certain data always in memory.

Our initial idea is to create two simple classes:

  1. "MemoryRecord": a class that will contain basically an array of objects (string, int, double, datetime, etc...), that are the data from a table record, and all methods to get those data in/out from this array.

  2. "MemoryTable": a class that will contain basically a Map of [Key,MemoryRecord] and all methods to manipulate this Map and insert/update/delete record into/from database.

Those classes will be derived to every kind of table we have in the database. Of course there are other useful methods not listed above, but they are not important at this point.

So, when starting the app, we will load those tables from an SQLite database to memory using those classes, and every time we need to change some data, we will change in memory and post it into the database right after.

But, we want some help/advice from you. Can you suggest something more simple or efficient to implement such a thing? Or maybe some existing classes that already do it for us?

I understand what you guys are trying to show me, and I thank you for that.

But, let's say we have a table with 2000 records, and I will need to list those records. For each one, I have to query other 30 tables (some of them with 1000 records, others with 10 records) to add additional information in the list, and this while it's "flying" (and as you know, we must be very fast at this moment).

Now you'll be going to say: "just build your main query with all those 'joins', and bring all you need in one step. SQLite can be very fast, if your database is well designed, etc...".

OK, but this query will become very complicated and sure, even though SQLite is very fast, it will be "too" slow (2 a 4 seconds, as I confirmed, and this isn't an acceptable time for us).

Another complicator is that, depending on user interaction, we need to "re-query" all records, because the tables involved are not the same, and we have to "re-join" with another set of tables.

So, an alternative is bring only the main records (this will never change, no matter what user does or wants) with no join (this is very fast!) and query the other tables every time we want some data. Note that on the table with 10 records only, we will fetch the same records many and many times. In this case, it is a waste of time, because no matter fast SQLite is, it will always be more expensive to query, cursor, fetch, etc... than just grabbing the record from a kind of "memory cache". I want to make clear that we don't plan to keep all data in memory always, just some tables we query very often.

And we came to the original question: What is the best way to "cache" those records? I really like to focus the discussion on that and not "why do you need to cache data?"

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Christian
  • 7,062
  • 9
  • 53
  • 79
  • 2
    "We wanna avoid to query database as most as we can, so we decided to keep certain data allways in memory." -- have you used Traceview to confirm this is an issue for your app? "we want some help/advice from you" -- my advice is: prove there is a problem first. You have very little RAM to work with. Building some big framework to deal with a non-existent problem would be a waste of effort. If you have demonstrated that this is an issue, I'd love a pointer to the blog post where you wrote it up, as I am always interested in performance test results. – CommonsWare Sep 09 '10 at 19:03
  • @CommonsWare: I understand your point and totally agree with you. But, as PalmOS and .NetCF developers, we already face this problem before. In PalmOS all data are in memory by design (.pdb) and there is no performance issue while getting data. In the other hande, in WM we face "the problem", and then we created this "solution" listed above. But now, in Android, we wish to do in the "right way". We want to Know if we are going to face performance problems when quering database "every time". So we decided to ask for suggestions here. Thanks anyway. – Christian Sep 09 '10 at 19:20
  • 2
    "We want to Know if we are going to face performance problems when quering database "every time". So we decided to ask for suggestions here." -- no, you did not. I wish you had. Instead, you declared a problem existed ("we don't want to query sqlite every time we need some record") and you wanted help with your solution. As with most platforms, the answer to if you are "going to face performance problems when quering database" is "it depends on the queries". Trust me, just because there is a problem on WM does not mean the same problem exists elsewhere. Use Traceview. – CommonsWare Sep 09 '10 at 19:54
  • You don't need to query database why scrolling. You can just query all data you need on activity start. – Dmitry Ryadnenko Sep 14 '11 at 09:29
  • Based on your edit, it sounds like you "need" a complex design with caching and/or an in-memory database because you personally find doing the joins too complex. Out of the two evils, I would stick with the (more performant) best practice and write my queries right, using joins. I've done something similar and it didn't turn out to be too unruly once I refactored the code and wrote methods to help with generating the query and inflating objects from Cursors. Although, if someone is set on using an in-memory database, SQLiteOpenHelper allows that by providing null as the database file name. – spaaarky21 Apr 11 '14 at 15:38
  • When you have a complex design, you can always create a `View` so simple querying is easier. (http://www.sqlite.org/lang_createview.html) Although Views are read-only in SQLite, it's still very useful. See it as a table with calculated columns, but completely up-to-date with your raw data tables. – Jelle Aug 05 '14 at 11:51

2 Answers2

69

The vast majority of the apps on the platform (contacts, Email, Gmail, calendar, etc.) do not do this. Some of these have extremely complicated database schemas with potentially a large amount of data and do not need to do this. What you are proposing to do is going to cause huge pain for you, with no clear gain.

You should first focus on designing your database and schema to be able to do efficient queries. There are two main reasons I can think of for database access to be slow:

  • You have really complicated data schemas.
  • You have a very large amount of data.

If you are going to have a lot of data, you can't afford to keep it all in memory anyway, so this is a dead end. If you have complicated structures, you would benefit in either case with optimizing them to improve performance. In both cases, your database schema is going to be key to good performance.

Actually optimizing the schema can be a bit a of a black art (and I am no expert on it), but some things to look out for are correctly creating indices on rows you will query, designing joins so they will take efficient paths, etc. I am sure there are lots of people who can help you with this area.

You could also try looking at the source of some of the platform's databases to get some ideas of how to design for good performance. For example the Contacts database (especially starting with 2.0) is extremely complicated and has a lot of optimizations to provide good performance on relatively large data and extensible data sets with lots of different kinds of queries.

Update:

Here's a good illustration of how important database optimization is. In Android's media provider database, a newer version of the platform changed the schema significantly to add some new features. The upgrade code to modify an existing media database to the new schema could take 8 minutes or more to execute.

An engineer made an optimization that reduced the upgrade time of a real test database from 8 minutes to 8 seconds. A 60x performance improvement.

What was this optimization?

It was to create a temporary index, at the point of upgrade, on an important column used in the upgrade operations. (And then delete it when done.) So this 60x performance improvement comes even though it also includes the time needed to build an index on one of the columns used during upgrading.

SQLite is one of those things where if you know what you are doing it can be remarkably efficient. And if you don't take care in how you use it, you can end up with wretched performance. It is a safe bet, though, if you are having performance issues with it that you can fix them by improving how you are using SQLite.

hackbod
  • 90,665
  • 16
  • 140
  • 154
5

The problem with a memory cache is of course that you need to keep it in sync with the database. I've found that querying the database is actually quite fast, and you may be pre-optimizing here. I've done a lot of tests on queries with different data sets and they never take more than 10-20 ms.

It all depends on how you're using the data, of course. ListViews are quite well optimized to handle large numbers of rows (I've tested into the 5000 range with no real issues).

If you are going to stay with the memory cache, you may want have the database notify the cache when it's contents change and then you can update the cache. That way anyone can update the database without knowing about the caching. Also, if you build a ContentProvider over your database, you can use the ContentResolver to notify you of changes if you register using registerContentObserver.

dhaag23
  • 6,106
  • 37
  • 35
  • I feel fine to know those statistics you said. I'm quite sure we're pre-optimizing, but it's because our experience in past with SQLite e mobile devices (WM). The sync between memory and database is not what we worry, because the same "time" we change something we will post it to database. The cache will be global, so all app will see the same thing. And all changes will be made first in memory and so in database (in an "atomic" operation, implemented in those classes). – Christian Sep 09 '10 at 19:39