6

I just came from the relational database school and dealing with JSON databases is not an easy task for new comers. I have this structure to store users:

{
  "users" : {
    "0CcKvNkOm5fVqL" : {
      "birthday" : 564688000,
      "country" : "US",
      "email" : "email@live.com",
      "firstName" : "John",
      "gender" : "male",
      "isOnline" : true,
      "lastLoginDate" : 1468166460486,
      "lastName" : "Paul",
      "learningLanguages" : [ {
        "language" : "fr_FR",
        "levelID" : 2
      } ],
      "profileImage" : "https://firebasestorage.googleapis.com/image.jpg",
      "providerID" : "Firebase",
      "registrationDate" : 1468168460486,
      "speakingLanguages" : [ {
        "language" : "es_ES",
        "levelID" : 7
      } ]
    }
  }
}

I offer a search screen in my app where users can search for other users and they can combine all these filter parameters:

Example:

Get 10 users starting from index 0 who are:

  • male
  • and from "US"
  • and speaks "da_DK" with levelID 2 or/and "fr_FR" with any level
  • and learns "de_DE"withlevel 1**and/or**learns "ar_AR"withlevel 4`
  • and with age range between 18 and 24
  • and order by isOnline and last login date.

This is an easy task with SQL, when supposing that there is a table called users_languages:

SELECT ...
FROM users AS u
JOIN users_languages AS l
  ON u.id = l.id
WHERE u.gender = "male" 
AND u.age BETWEEN 18 AND 24 // need claculation but let's keep it simple
AND u.country = "US"
AND ((l.language = "de_DE" AND l.mode = "learning" AND l.level = 1) OR (l.language = "ar_AR" AND l.mode = "learning" AND l.level = 4))
....
ORDER BY isOnline, lastLoginDate DESC
LIMIT 0,10

My questions:

  1. How can I build the query above with Firebase with the actual structure
  2. If it's not possible how to improve my database structure for my specific use case (to be able to handle the query above in a better way)
iOSGeek
  • 5,115
  • 9
  • 46
  • 74
  • Recommended reading: [NoSQL data modeling](https://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/) and [my answer here](http://stackoverflow.com/questions/26700924/query-based-on-multiple-where-clauses-in-firebase) – Frank van Puffelen Jul 10 '16 at 23:36
  • Two things: please post your Firebase structure as text, no images. It makes it easier to copy/paste into an answer and also makes it searchable. Firebase console has an export feature. Also, how many users would you have online at any given time? – Jay Jul 10 '16 at 23:38
  • @Jay thx, I have updated my question with text structure, and I am expecting maybe as a first step hundreds or dozens .. but it would be great if the solution can be scalable, any suggestions ? – iOSGeek Jul 11 '16 at 07:24
  • @FrankvanPuffelen even reading the document and your previous answer does not seem to give me a an answer for my case :( and help would be appreciated – iOSGeek Jul 11 '16 at 17:41
  • 2
    The snapshot contains the information needed to return the children in the correct order. But simply printing the snapshot likely converts it to a dictionary, which loses this ordering information. See http://stackoverflow.com/questions/38266429/firebase-swift-use-indexon-to-sort-data-snapshot/38269746#38269746 – Frank van Puffelen Jul 11 '16 at 23:41
  • @FrankvanPuffelen the simple order works but it would be great if you can give more information about how to structure my database and a query example to handle my query above which is possible by SQL, I have updated the question, I am really **stack** I even gived away the half of my reputation as a bounty for this question! – iOSGeek Jul 12 '16 at 22:28
  • You're trying to model a SQL data model in a NoSQL database. No bounty can solve that problem. The answer linked gives one approach that *might* give a solution. Try it and show how far you get. Without seeing that from your side this question is really a duplicate of http://stackoverflow.com/questions/26700924/query-based-on-multiple-where-clauses-in-firebase – Frank van Puffelen Jul 13 '16 at 00:43
  • @FrankvanPuffelen please are you sure that the approaches in your linked answer is applicable in my situation because I only see that 2 and 3 are not applicable because I have multiple parameters I once found that wether I fetch all users and filter in client side (not scalable) or use elastic search+Firebase (lose the fact that we use Firebase to not deal with backend stuff) am I wrong – iOSGeek Jul 14 '16 at 17:43

1 Answers1

11

The to-the-point answer is: You can't do this type is search in Firebase.

Let me present a wall-of-text answer that will hopefully lead to a solution.

To Be Quite Frank: As Frank had mentioned in his on-point comments and links, leveraging other products such as ElasticSearch can be one solution. While they do provide scalability, it adds another product to the equation. I would suggest exploring those options further.

Filtering is Cool: A second solution is to filter in code. While this is a good solution for a couple thousands records, it's not scalable for tens/hundreds of thousands of records. However, this is the best solution if you have a complex data structure and a limited amount of data.

On that note, you can filter in code even with millions of records if the UI is structured to make it work. Decide on one to or two main searches, gender for example. Then perform a query for all females. That cuts your dataset by half and is much more manageable in code. You can also further reduce your dataset - see next section.

Change is good: Another option is to structure your data to match the types of queries you will be performing. For a simple example: Suppose you have three items you want to query on; gender_country_age

Your Firebase structure would be

users
  -Jyiai09jsi
    data: "male_US_40"
  -Jqkjisjida
    date: "male_US_27"
  -JyHYjlkall
    data: "male_US_30"

Then to query for all male users in the US between the ages of 30 and 40

usersRef.queryOrderedByChild("data").queryStartingAtValue("male_US_30")
        .queryEndingAtValue("male_US_40").observeSingleEventOfType(
       .Value, withBlock: { snapshot in
    print(snapshot)
})

The upside here is that's it's scalable but the downside is you can't query for just US users. On the flip side, this is a much smaller dataset which you could further filter in code.

Duplicate data is your friend: The good news is there's a solution for that as well: disk space is cheap so duplicate your data

user_countries
   US
       -Jyiai09jsi: true
       -Jqkjisjida: true
       -JyHYjlkall: true
   UK
      etc etc

user_gender
   male
       -Jyiai09jsi: true
       -Jqkjisjida: true
       -JyHYjlkall: true
   female
       etc etc

user_speaks
   da_UK
      users
   fr_FR
      users

This structure give you super quick access to data groups; countries, gender etc. I used true here as a placeholder but technically you could have each users node in that spot as well. But, that would again be reading in a lot of data during the query; a bunch of 'true' nodes is a pretty tiny amount of data, even with thousands of nodes.

SQL ftw! Something else to consider is how you are using the asynchronous nature of Firebase. Do you really need to have that data stored in Firebase or can you store that data in another cloud based SQL server for queries and store links to that data within Firebase. That way you can SQL query to your hearts content and then use Firebase for messaging, updates etc.

Final thought Your best bet if you want to to these kinds of searches is to structure your data in a way that reduces it's footprint as quickly as possible, then filter the rest in code. Imagine having a million records and then querying for male_US_30_FR. Now you have a couple of thousand records that's easily loaded and further filtered in code

I hope one or a combination of these helps.

Jay
  • 34,438
  • 18
  • 52
  • 81
  • 1
    Thanks for another awesome answer Jay! – Frank van Puffelen Jul 20 '16 at 21:23
  • Nowadays, Firebase Firestore allows for these advanced querying options! – Eric Mar 10 '22 at 19:32
  • 1
    @Eric So true! Firestore has much more powerful queries. However, if you look closely at the original question, the query contains OR statements on multiple fields which even today, is challenging for Firestore. Also note *If you include a filter with a range comparison your first ordering must be on the same field* and the ordering in the question is on *isOnline* and *last login date* but would need to be by age since that's the range parameter. – Jay Mar 10 '22 at 20:19