
So, I'm sure this must have been asked before but I can't seem to find anything. The thing is that when I'm programming search features for web apps, it never feels quite right to me.

I'm using Ruby on Rails, but I guess this is a question that applies to any situation in which you are using a RESTful MVC pattern.

Let's say you have a Resource (e.g. Users, ToDos, …) you want to search through. As soon as the app grows, simple LIKE queries are no longer feasible and you start using an index (e.g. Solr, ElasticSearch, Lucene, …). The indexed resource also tends to be compound data built from the Resource and its associated objects (a User's location, the ToDo's creator, …).

How do we best represent this?

  • Is it a GET to /resources (Resource#index)? It is a selective list of the main resource, but then again it's actually this compound thing and if the search functionality is extensive it really tends to bloat up the Model's code.
  • Is it a POST to /searches (Search#create)? We are creating a search but are not saving it. Instead it kind of gets transformed into a set of SearchResults.
  • So, is it a GET to a SearchResult (SearchResult#show)? But it doesn't have an ID. I guess the SearchIndex is kind of the database for that model, but you wouldn't really create a SearchResult, right? It's more of a Search#create that ends in a SearchResult#show but that also feels wonky to me.
KonstantinK
  • After a little more thought about this topic I realize that this really doesn't have anything to do with MVC. We are talking mainly about the best way to represent searches in REST, no matter the underlying data architecture. As suggested below, `GET`ing resources and filtering them by parameters seems to be the common way to go about this. And if MVC is the only architectural abstraction layer one uses then yes, the models tend to get bloated. BTW, without trying to bring MVC into the picture, my first sentence proved correct: [RESTful URL design for search](http://stackoverflow.com/q/207477/784889) – KonstantinK Jul 05 '15 at 15:39

1 Answer


Using POST for search operations is usually not recommended, as you lose all the advantages GET has to offer: clear semantics, idempotency, safety, cacheability, ...

Many RESTful and REST-like systems use simple GET queries with the search parameters passed as either query or path parameters, which allows client- and server-side caching of queries and results. Since HTTP/1.1, caching GET requests that contain query parameters isn't an issue as long as the caching headers are specified correctly.
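As a rough sketch of the GET-with-parameters style in plain Ruby (standard library only): the `/users` path and the `name`, `city` and `zip` parameters are assumptions made up for illustration, and sorting the parameters is just one possible way to make equivalent searches share a single cacheable URL.

```ruby
require "uri"

# Build a canonical search URL: drop blank values and sort the keys so that
# equivalent searches always map to the same (cacheable) URL.
def search_url(base_path, params)
  cleaned = params
    .reject { |_key, value| value.nil? || value.to_s.strip.empty? }
    .map { |key, value| [key.to_s, value.to_s] }
    .sort_by { |key, _value| key }
  return base_path if cleaned.empty?

  "#{base_path}?#{URI.encode_www_form(cleaned)}"
end

# Hypothetical parameters; both calls yield the same URL.
puts search_url("/users", name: "Max", city: "New York")
puts search_url("/users", city: "New York", name: "Max", zip: "")
```

Both calls print `/users?city=New+York&name=Max`, so a cache in front of the application sees one entry instead of two.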

But predefined queries have a smell of the LIKE queries you are trying to avoid. ElasticSearch in particular allows new fields to be added to types dynamically, which introduces overhead if predefined filters have to keep up with those fields. Therefore, adding queries dynamically as needed is probably a base requirement in the long run. This isn't all too hard to achieve though.

A sample output for a GET /users/12345 query which contains dynamically added search filters might therefore look like this:

{
    "id": "12345",
    "firstName": "Max",
    "lastName": "Test",
    "_schema": {
        "href": "http://example.com/schema/user"
    },
    "_links": {
        "self": {
            "href": "/users/12345",
            "methods": ["get", "put", "delete"]
        },
        "curies": [{ 
            "name": "usr", 
            "href": "http://example.com/docs/rels/{rel}", 
            "templated": true
        }],
        "usr:employee": {
            "href": "/companies/112233",
            "title": "Sample Company",
            "type": "application/hal+json"
        }
    },
    "_embedded": {
        "usr:address": [
            {
                "_schema": {
                    "href": "http://example.com/schema/address"
                },
                "street" : "Sample Street",
                "zip": "...",
                "city": "...",
                "state": "...",
                "location": {
                    "longitude": "...",
                    "latitude": "..."
                },
                "_links": {
                    "self": {
                        "href": "/users/12345/address/1",
                        "methods": ["get", "post", "put", "delete"]
                    }
                }
            }
        ],
        "usr:search": {
            "_schema": {
                "href": "http://example.com/schema/user_search"
            },
            "_links": {
                "self": {
                    "href": "/users/12345/search",
                    "methods": ["post", "delete"]
                }
            },
            "filters": [
                {
                    "_schema": {
                        "href": "http://example.com/schema/user_search_filter"
                    },
                    "_links": {
                        "self": {
                            "href": "/users/12345/search/filters",
                            "methods": ["get"]
                        },
                        "next": {
                            "href": "/users/12345/search/filters?page=2",
                            "methods": ["get"]
                        }
                    }
                },
                {
                    "byName": {
                        "query": {
                            "constant_score": {
                                "filter": {
                                    "term": {
                                        "name": {
                                            "href": "/users/12345#name"
                                        }
                                    }
                                }
                            }
                        },
                        "_links": {
                            "self": {
                                "href": "/users/12345/search/filter/byName",
                                "methods": ["get", "put", "delete"],
                                "_schema": {
                                    "href": "http://example.com/schema/search_byName"
                                },
                                "type": "application/hal+json"
                            }
                        }
                    }
                },
                {
                    "in20kmDistance" : {
                       "query": {
                           "filtered" : {
                               "query" : {
                                   "match_all" : {}
                               },
                               "filter" : {
                                   "geo_distance" : {
                                       "distance" : "20km",
                                       "Location" : {
                                           "lat" : {
                                               "href": "/users/12345/address/location#lat"
                                           },
                                           "lon" : {
                                               "href": "/users/12345/address/location#lon"
                                           }
                                       }
                                   }
                               }
                           }
                        },
                        "_links": {
                            "self": {
                                "href": "/users/12345/search/filter/in20kmDistance",
                                "methods": ["get", "put", "delete"],
                                "_schema": {
                                    "href": "http://example.com/schema/search_in20kmDistance"
                                },
                                "type": "application/hal+json"
                            }
                        }
                    }
                },
                {
                    ...
                }
            ]
        }
    }
}

The example code above contains a user representation with embedded address and search filters in an extended JSON HAL format. As RESTful resources should be as self-explanatory as possible, the sample contains links to their location and to their schema, so that clients issuing POST and PUT operations also know which fields the server might need.

The search resource acts as a controller for filters in that it only allows adding new filters or deleting all of them at once, while paging through the filters is achieved by invoking GET on /users/{userId}/search/filters?page=pageNo.
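To make that controller role a bit more concrete, here is a minimal in-memory sketch in Ruby. The `SearchResource` class, its method names and the page size of 2 are all hypothetical; a real application would sit behind actual routes and a persistent store.

```ruby
# In-memory stand-in for the search resource of one user (hypothetical class).
class SearchResource
  PAGE_SIZE = 2 # assumed page size, kept small for the demo

  def initialize
    @filters = {} # filter name => query definition (insertion-ordered)
  end

  # POST /users/{userId}/search: register a new named filter.
  def add_filter(name, query)
    raise ArgumentError, "filter #{name} already exists" if @filters.key?(name)

    @filters[name] = query
  end

  # DELETE /users/{userId}/search: drop all filters at once.
  def clear
    @filters.clear
  end

  # GET /users/{userId}/search/filters?page=N: one page of filter names.
  def page(number)
    return [] if number < 1

    @filters.keys.each_slice(PAGE_SIZE).to_a[number - 1] || []
  end
end

search = SearchResource.new
search.add_filter("byName", { "term" => { "name" => "Max" } })
search.add_filter("in20kmDistance", { "geo_distance" => { "distance" => "20km" } })
search.add_filter("byCity", { "term" => { "city" => "..." } })

p search.page(1) # first page of filter names
p search.page(2) # second page
search.clear
p search.page(1) # empty after deleting all filters
```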

Each filter contains the instruction to execute (in this case an ElasticSearch query that matches either the name of the user or everything within 20 km of the current address) as well as a link to the URI that executes the query. Note that the ElasticSearch query contains links to the resources holding the data the query should use. It would of course also be possible to return a valid ElasticSearch query containing the actual user data, or to use a JSON Pointer instead of URIs to the data; this is again an implementation detail.
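One way the server could turn such a linked query into an executable one is to walk the structure and swap every `href` placeholder for the value it points to. The following Ruby sketch assumes a simple lookup table keyed by URI (`user_data`) and a hypothetical helper named `resolve_links`; neither is part of ElasticSearch or HAL.

```ruby
# Recursively replace every {"href" => uri} placeholder with the value
# found for that URI in the lookup table. The input is left untouched.
def resolve_links(node, data)
  case node
  when Hash
    if node.keys == ["href"]
      data.fetch(node["href"]) # raises KeyError for unknown links
    else
      node.transform_values { |value| resolve_links(value, data) }
    end
  when Array
    node.map { |value| resolve_links(value, data) }
  else
    node
  end
end

# Assumed lookup table; a real server would load the linked resource instead.
user_data = { "/users/12345#name" => "Max" }

by_name = {
  "query" => {
    "constant_score" => {
      "filter" => { "term" => { "name" => { "href" => "/users/12345#name" } } }
    }
  }
}

p resolve_links(by_name, user_data)
```

The resolved hash carries `"name" => "Max"` where the link used to be and could then be serialized and sent to the index.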

This approach allows adding new queries or updating existing ones dynamically at runtime while keeping the GET semantics intact at query time. Furthermore, caching can be utilized, which may improve performance significantly, especially if the user data does not change often.
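A small Ruby sketch of what that caching could look like with a conditional GET. The `respond` helper, the SHA-256 based ETag and the `max-age=60` value are assumptions chosen for the example, not something the approach prescribes.

```ruby
require "digest"
require "json"

# Compute a validator (ETag) from the response body.
def etag_for(body)
  %("#{Digest::SHA256.hexdigest(body)}")
end

# Minimal conditional GET: returns [status, headers, body].
# A matching If-None-Match value yields 304 with an empty body.
def respond(results, if_none_match = nil)
  body = JSON.generate(results)
  etag = etag_for(body)
  headers = { "ETag" => etag, "Cache-Control" => "private, max-age=60" }
  return [304, headers, ""] if if_none_match == etag

  [200, headers, body]
end

results = [{ "id" => "12345", "firstName" => "Max" }]

status, headers, _body = respond(results)
puts status # first request: full response

status, _headers, _body = respond(results, headers["ETag"])
puts status # repeated request with the ETag: not modified
```

This prints 200 and then 304; as long as the search results stay the same, repeated requests only cost a validation round trip.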

The drawback of this approach is that you have to return more data on user lookups. You could also leave the filters out of the embedded data and have clients fetch them explicitly. Furthermore, filters are currently added under a name that acts as the key, which in practice may lead to naming clashes. UUIDs would avoid that but take away semantics if humans have to invoke those URIs, as byName certainly means more to a human than de305d54-75b4-431b-adb2-eb6b9e546014. This is more of an implementation detail, though.
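One possible middle ground, sketched in Ruby: key each filter by a generated UUID and keep the human-readable name as a `title` next to it. The `register_filter` helper and the hash layout are hypothetical.

```ruby
require "securerandom"

# Store a filter under a generated UUID and keep the human-readable
# name as a title, so two filters may share a name without clashing.
def register_filter(filters, title, query)
  id = SecureRandom.uuid
  filters[id] = { "title" => title, "query" => query }
  id
end

filters = {}
first = register_filter(filters, "byName", { "term" => { "name" => "Max" } })
second = register_filter(filters, "byName", { "term" => { "name" => "Test" } })

puts filters.size # two entries despite the identical title
puts "/users/12345/search/filter/#{first}"
puts "/users/12345/search/filter/#{second}"
```

Clients would then follow the links they are given and show the title to humans, rather than building URIs from names themselves.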

Roman Vottner
  • Alright, thank you Roman. Especially for the explanation of why we can rule out `POST`. Having a search resource nested below the main resource is actually a good way to create a self-explanatory route ... although it still has the problem of implying it to be stateful. – KonstantinK Jul 05 '15 at 15:27