
Let's say you have a Person object with several relatively small fields like first_name, last_name and age, and several large fields like life_story.

Most calls to retrieve Person objects do not require returning the life_story, so we would rather not return it on all calls to the Person endpoint. On the other hand, when POSTing a new Person, we would like to allow the client to include the life_story field.

One option would be to have a Person endpoint and a PersonDetailed endpoint, where all calls (GET/POST/PUT) to the Person do not handle the life_story field, and all calls to the PersonDetailed require all fields.

Finally, we could fudge it and have the POST and PUT methods on Person allow clients to optionally include the life_story, but not return it from GET calls to endpoints like

API/Person/?last_name_like=La

I'm not a fan of having GET, POST and PUT methods on the same endpoint return objects with different fields, but it does keep the API simpler.

I've been looking for examples of how people deal with issues like this, but have not found any. Can anyone point to an article or book that discusses issues like this?

bpeikes
  • You should use content negotiation. See http://stackoverflow.com/questions/7846900/rest-api-having-same-object-but-light/7853235#7853235 – Will Hartung Feb 19 '16 at 01:30
  • @WillHartung, content negotiation works if we have a limited number of content formats, e.g. _lite_ vs. _full_; but what if the content has hundreds of fields and we want to allow the client to select any combination of those? – jaco0646 Feb 19 '16 at 17:02
  • I am under the impression that content negotiation is for format, i.e. json vs. xml, not for representations given a particular encoding. – bpeikes Feb 19 '16 at 20:47

4 Answers


As requested by @jaco0646

TL;DR

  • A core user resource with embedded sub-resources like address, groups, posts or pm (/api/v1/users/{user_uuid})
  • users also contain an embedded resource called views, which handles the currently registered views (/api/v1/users/{user_uuid}/views/{some_view})
  • A view is created using a POST request (e.g. from an HTML form) that lists the selected sub-resources
  • Each view contains the core user data and the data for the selected fields
  • If all views start with the core user data, a partial GET request may be used to download only the additional data, though this has its limits

Issues with current answers

Before I post my approach to filtering on certain properties, I want to give a quick insight into why I do not agree with the answers currently given by @jaco0646, @Yoram and @JoseMartinez (which are all more or less the same IMO).

Caching of response content

HTTP tries to reduce network overhead by caching responses. A second lookup for the same resource should, in the best case, be served from the local cache instead of querying and downloading the result from the server again. This is especially helpful if the resource data does not change often.

With certain Cache-Control directives and the If-Modified-Since request header, a client can influence whether cached content is used or whether the cache is refreshed with the current content. GET requests with query parameters are often said to be excluded from caching, which is more of an urban legend than the truth, although certain implementations may indeed avoid caching such resources. According to RFC 7234, a cache uses the effective request URI to store and retrieve responses, which by default is the target URI including any query, matrix and path parameters. As such, the whole URI is considered to be the key used to store and access responses.
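A minimal sketch of that last point, assuming a toy in-memory cache (the `TinyCache` class, `fake_origin` and the URLs are hypothetical and not part of any real library): because the key is the whole URI including its query string, two requests that differ only in their query are stored as separate entries, while a repeated request is served from the cache.

```python
class TinyCache:
    """Toy response cache keyed by the full request URI (illustrative only)."""

    def __init__(self):
        self._store = {}
        self.origin_hits = 0  # how often we had to go to the "server"

    def get(self, uri, fetch):
        # The complete URI, query string included, is the cache key.
        if uri not in self._store:
            self.origin_hits += 1
            self._store[uri] = fetch(uri)
        return self._store[uri]


def fake_origin(uri):
    # Stand-in for a real HTTP call; the body simply echoes the URI.
    return "response for " + uri


cache = TinyCache()
cache.get("/api/people?fields=first_name", fake_origin)
cache.get("/api/people?fields=first_name", fake_origin)      # served from cache
cache.get("/api/people?fields=first_name,age", fake_origin)  # different key
print(cache.origin_hits)
```

A real HTTP cache also takes the request method, freshness and Vary headers into account; the sketch only illustrates the role of the URI as key.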

Partial GET request and use-cases

As mentioned by jaco in his post, the HTTP protocol defines, besides the standard GET and the conditional GET request, a partial GET request, which allows a client to request only a part of a resource instead of the full resource.

While this may sound great at first, a partial GET request has, at least in HTTP/1.1, the limitation that it only works on bytes.

The only range unit defined by HTTP/1.1 is "bytes".

The Range header allows multiple byte ranges to be listed in a request, so that the response includes multiple segments:

GET /someResource HTTP/1.1
Host: some-host.com
Range: bytes=500-700,1200-

The partial request asks for bytes 500 through 700 (inclusive) and everything from byte 1200 to the end.

Usually a partial GET request is used to resume a broken download or for buffering a running stream as the exactly downloaded bytes are already known. But, how do you specify in advance the byte ranges of each filter-field? Without a-priori knowledge I don't think this will work.
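To make the byte-oriented nature concrete, here is a rough sketch of how a server might map such a Range value onto a response body. The helper `slice_ranges` is a hypothetical name and the body is dummy data; the sketch ignores unsatisfiable ranges and multipart framing, which a real server has to handle.

```python
def slice_ranges(body: bytes, range_header: str) -> list[bytes]:
    """Return the byte segments selected by a 'bytes=...' Range header value."""
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only the 'bytes' range unit is supported")
    segments = []
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        if first == "":
            # Suffix range such as "-100": the final 100 bytes.
            start = max(len(body) - int(last), 0)
            segments.append(body[start:])
        elif last == "":
            # Open-ended range such as "1200-": from byte 1200 to the end.
            segments.append(body[int(first):])
        else:
            # Both positions are inclusive, hence the +1.
            segments.append(body[int(first):int(last) + 1])
    return segments


body = bytes(range(256)) * 6  # 1536 bytes of dummy content
parts = slice_ranges(body, "bytes=500-700,1200-")
print([len(p) for p in parts])
```

Note that nothing in the request refers to fields or properties: the ranges are plain offsets, so a client has to know where a given field sits in the serialized representation before it can ask for it.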

URL size limitation

In case there are many fields which may be available for filtering, using a GET request with query or matrix parameters may cause issues, as some browsers limit URLs to roughly 2000 characters.

While this may not affect the OP's issue, another user who requires extensive filtering properties may run into it.

Resources and sub-resources

ReST focuses on resources and the methods the HTTP protocol offers to interact with them.

A user resource, for example, has certain "core" data like the user name, an id and maybe other domain-specific things. But it also has additional data like the address, ... which may be part of the user resource as well.

Instead of mingling every property into a single entity, ReSTful applications try to have plenty of resources. In the sample above, user and address are just two of them, but there are certainly many more. When you start working on a ReSTful design, it might not be clear whether certain data should be part of a resource or refactored into its own resource. A rule of thumb: if you need certain data in at least two different resources, refactor it and embed it within these resources.

Dividing large(r) resources into a hierarchy makes it easy to update sub-resources (in the pure HTTP sense of replacing what is currently available at resource X with the new content) when something changes (like a user's address), whereas one big resource that handles all data requires sending the whole entity body (if used properly) to the server instead of only the change.

Entity formats

Plenty of "ReSTful" services exchange data in application/xml or application/json format. However, neither conveys much semantics. They just lay out the syntax rules, which might be validated on the client side, but they give no hint about the actual content. A client therefore also needs a-priori knowledge of how to process data received in one of these formats.

If JSON is the representation format of your choice, I'd use JSON HAL (application/hal+json) instead, as it defines core data, links and embedded content, which is quite useful, especially for the presented scenario IMO.

Proposed solution

The proposed approach has a core user resource which embeds certain sub-resources like address, groups, posts or pm. It also contains an embedded resource called views, which handles the currently registered views for either a single user or for users in general. A view is created by sending a POST request (e.g. from an HTML form) that names the sub-resources to include within the response.

The core resource is a user resource, which might be available at /api/v1/users/{user_uuid} and by default only includes the user core data and links to the other resources:

{
    "firstName": "Maria",
    "lastName": "Sample",
    ...
    "_links": {
        "self": {
            "href": "/api/users/1234-5678-9123-4567"
        },
        "addresses": [
            { "href": "/api/users/1234-5678-9123-4567/addresses/abc1" }
        ],
        "groups": [
            { "href": "/api/users/1234-5678-9123-4567/groups" }
        ],
        "posts": [
            { "href": "/api/users/1234-5678-9123-4567/posts" }
        ],
        ...
        "views": [
            { "href": "/api/users/1234-5678-9123-4567/views/view-a" },
            { "href": "/api/users/1234-5678-9123-4567/views/view-b" }
        ]
    }
}

Any sub-resource is available via the users resource URI: /api/v1/users/1234-5678-9123-4567/{sub_resource}, where sub_resource may be one of the following: addresses, groups, posts, ...

The actual sub-resource for an address may, for example, look like this

{
    "street": "Sample Street",
    "city": "Some City",
    "zipCode": "12345",
    "country": "Neverland",
    ...
    "_links": {
        "self": {
            "href": "/api/v1/users/1234-5678-9123-4567/addresses/abc1"
        },
        "googleMaps": {
            "href": "http://maps.google.com/?ll=39.774769,-74.86084"
        }
    }
}

while the user has two posts like these

{
    "id": 1,
    "date": "2016-02-21T14:06:20.345Z",
    "text": "Lorem ipsum ...",
    "_links": {
        "self": {
            "href": "/api/users/1234-5678-9123-4567/posts/1"
        }
    }
}

{
    "id": 2,
    "date": "2016-02-21T14:34:50.891Z",
    "text": "Lorem ipsum ...",
    "_links": {
        "self": {
            "href": "/api/users/1234-5678-9123-4567/posts/2"
        }
    }
}

A view (/api/users/1234-5678-9123-4567/views/view-a) which contains addresses and posts may look like this:

{
    "firstName": "Maria",
    "lastName": "Sample",
    ...
    "_links": {
        "self": {
            "href": "/api/users/1234-5678-9123-4567"
        },
        "addresses": [
            { "href": "/api/users/1234-5678-9123-4567/addresses/abc1" }
        ],
        "groups": [
            { "href": "/api/users/1234-5678-9123-4567/groups" }
        ],
        "posts": [
            { "href": "/api/users/1234-5678-9123-4567/posts" }
        ],
        ...
        "views": [
            { "href": "/api/users/1234-5678-9123-4567/views/view-a" },
            { "href": "/api/users/1234-5678-9123-4567/views/view-b" }
        ]
    },
    "_embedded": {
        "addresses": [
            {
                "street": "Sample Street",
                "city": "Some City",
                "zipCode": "12345",
                "country": "Neverland",
                ...
                "_links": {
                    "self": {
                        "href": "/api/v1/users/1234-5678-9123-4567/addresses/abc1"
                    },
                    "googleMaps": {
                        "href": "http://maps.google.com/?ll=39.774769,-74.86084"
                    }
                }
            }
        ],
        "posts": [
            {
                "id": 1,
                "date": "2016-02-21T14:06:20.345Z",
                "text": "Lorem ipsum ...",
                "_links": {
                    "self": {
                        "href": "/api/users/1234-5678-9123-4567/posts/1"
                    }
                }
            },
            {
                "id": 2,
                "date": "2016-02-21T14:34:50.891Z",
                "text": "Lorem ipsum ...",
                "_links": {
                    "self": {
                        "href": "/api/users/1234-5678-9123-4567/posts/2"
                    }
                }
            }
        ]
    }
}

Another view (e.g. /api/users/1234-5678-9123-4567/views/view-b) may only include posts made by the selected user:

{
    "firstName": "Maria",
    "lastName": "Sample",
    ...
    "_links": {
        "self": {
            "href": "/api/users/1234-5678-9123-4567"
        },
        "addresses": [
            { "href": "/api/users/1234-5678-9123-4567/addresses/abc1" }
        ],
        "groups": [
            { "href": "/api/users/1234-5678-9123-4567/groups" }
        ],
        "posts": [
            { "href": "/api/users/1234-5678-9123-4567/posts" }
        ],
        ...
        "views": [
            { "href": "/api/users/1234-5678-9123-4567/views/view-a" },
            { "href": "/api/users/1234-5678-9123-4567/views/view-b" }
        ]
    },
    "_embedded": {
        "posts": [
            {
                "id": 1,
                "date": "2016-02-21T14:06:20.345Z",
                "text": "Lorem ipsum ...",
                "_links": {
                    "self": {
                        "href": "/api/users/1234-5678-9123-4567/posts/1"
                    }
                }
            },
            {
                "id": 2,
                "date": "2016-02-21T14:34:50.891Z",
                "text": "Lorem ipsum ...",
                "_links": {
                    "self": {
                        "href": "/api/users/1234-5678-9123-4567/posts/2"
                    }
                }
            }
        ]
    }
}
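How a server might assemble such views can be sketched roughly as follows. This is only an illustration under assumptions: `build_view` is a hypothetical helper, the resources are plain dicts, and the sample data is shortened from the documents above.

```python
import json


def build_view(core: dict, sub_resources: dict, selected: list) -> dict:
    """Return the core document plus an _embedded section for the selected names."""
    view = dict(core)  # shallow copy keeps the core fields and _links first
    view["_embedded"] = {name: sub_resources[name] for name in selected}
    return view


core = {
    "firstName": "Maria",
    "lastName": "Sample",
    "_links": {"self": {"href": "/api/users/1234-5678-9123-4567"}},
}
sub_resources = {
    "addresses": [{"street": "Sample Street", "city": "Some City"}],
    "posts": [
        {"id": 1, "text": "Lorem ipsum ..."},
        {"id": 2, "text": "Lorem ipsum ..."},
    ],
}

view_a = build_view(core, sub_resources, ["addresses", "posts"])
view_b = build_view(core, sub_resources, ["posts"])
print(json.dumps(view_b, indent=4))
```

Every view starts with the same core data and differs only in its _embedded section, which is what the views above rely on.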

On invoking /api/users/1234-5678-9123-4567/views you may show a list of the currently available views and also an HTML form (or some custom UI) with a checkbox for each available field you want to include or exclude. On receiving the form data, the server checks whether a view already exists for the given properties (if so, it responds with 409 Conflict); otherwise it creates a new view, which may be reused later. You might also name the views and include certain selected properties within the views segment of the _links section.

Instead of specifying a view per user, you can also create a general view once for all users and reuse it at will.

As the views have no query parameters, the whole response is cacheable. As you create a view using a POST request (if idempotency is an issue, use an empty POST request followed by a PUT request), you can handle an almost unlimited number of parameters. This HAL-like dialect uses its own logic for views. It might therefore also be a good idea to create a custom content type like: application/vnd+users.views+hal+json
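The duplicate check on view creation could be sketched like this. `ViewRegistry` is a hypothetical in-memory stand-in for whatever storage the server uses, and returning 201 for a newly created view is an assumption; only the 409 Conflict is taken from the description above.

```python
class ViewRegistry:
    """Toy in-memory store of views, keyed by the set of selected sub-resources."""

    def __init__(self):
        self._by_fields = {}

    def create(self, name, fields):
        key = frozenset(fields)  # the order of the ticked checkboxes must not matter
        if key in self._by_fields:
            # A view with exactly these fields is already registered.
            return 409, self._by_fields[key]
        self._by_fields[key] = name
        return 201, name


registry = ViewRegistry()
print(registry.create("view-a", ["addresses", "posts"]))
print(registry.create("view-b", ["posts"]))
print(registry.create("view-c", ["posts", "addresses"]))  # same fields as view-a
```

Returning the name of the existing view together with the 409 lets the client follow up with a GET on that view instead of retrying.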

Concerning partial GET requests:

As the core user data is the same for every view, it might be possible to use the length of the core data (minus the closing bracket and any whitespace characters after the second last bracket) and issue a partial GET request to the server. It should respond with only the embedded data (and the final closing bracket), though I'm not sure if current browsers are actually able to update the current data accordingly, especially if certain bytes of the known content need to be removed like the final bracket of the core user data.
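The idea can be tried out offline with a small sketch. It assumes the server serializes the core document and the view identically (same key order, same whitespace), which is a strong assumption; the sample data and variable names are made up for illustration.

```python
import json

core = {"firstName": "Maria", "_links": {"self": {"href": "/api/users/1"}}}
view = dict(core, _embedded={"posts": [{"id": 1}, {"id": 2}]})

core_bytes = json.dumps(core).encode("utf-8")
view_bytes = json.dumps(view).encode("utf-8")

# Everything except the final "}" of the core document is shared with the view.
offset = len(core_bytes) - 1
assert view_bytes.startswith(core_bytes[:offset])

range_header = f"bytes={offset}-"  # what the client would send
tail = view_bytes[offset:]         # what a 206 response would carry

# The client drops its own closing brace and appends the received tail.
rebuilt = core_bytes[:offset] + tail
print(range_header)
print(json.loads(rebuilt) == view)
```

As soon as the serialization differs in key order or whitespace, the prefix no longer matches and the offsets are useless, which is one of the limits mentioned above.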

Roman Vottner
  • With the /user/{id}/views methodology, how would you request viewA for all users with last name = Last? – bpeikes Aug 14 '16 at 04:12
  • @bpeikes sorry for responding so late, but I probably missed the notification on the question. The proposed solution showcases only a single-user case. As the search term might match multiple users, a collection of data might be returned. IMO it is better in that case to return e.g. a JSON array that just contains basic information like the name and a link pointing directly to the view of that user, e.g. viewA. If the client wants to retrieve more detailed information, it can simply invoke that view directly – Roman Vottner Jul 17 '18 at 14:39

Use query parameters.
api/people?fields=first_name,last_name,age

Using the ?fields= syntax is simple to read; a client can select just the information needed at a given time.
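On the server side this can be handled with very little code. The following is a minimal sketch, not a definitive implementation: `select_fields` is a hypothetical helper, the person record is sample data, and unknown field names are silently ignored, which is just one possible design choice.

```python
from urllib.parse import urlsplit, parse_qs


def select_fields(record: dict, url: str) -> dict:
    """Keep only the keys named in the URL's comma-separated 'fields' parameter."""
    query = parse_qs(urlsplit(url).query)
    if "fields" not in query:
        return dict(record)  # no filter given: return everything
    wanted = query["fields"][0].split(",")
    return {key: record[key] for key in wanted if key in record}


person = {
    "first_name": "Maria",
    "last_name": "Sample",
    "age": 42,
    "life_story": "A very long text ...",
}
print(select_fields(person, "api/people?fields=first_name,last_name,age"))
```

The large life_story field is then only transferred when a client explicitly asks for it, or when no fields parameter is given at all; an API could just as well default to a short field list instead.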

On a somewhat-related note, HTTP also includes support for partial content requests, denoted by the 206 response code. You could potentially provide part of the life_story without returning all of it.

jaco0646
  • This method requires a-priori knowledge of the available data. In a true RESTful environment, a client should start from an entry URL (like the domain of the page) and progress through the content using the given links in the response. Also client and server should rely on content negotiations to retrieve content in the desired format. While `application/xml` or `application/json` have hardly any semantic value, something like `application/vnd+compX.userDataShort+xml` or `application/vnd+compX.userDataComplete+json` may convey plenty more semantics. – Roman Vottner Feb 19 '16 at 03:08
  • Also, `206` has a slightly different meaning. It should be returned for [a partial GET request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.3) which has to include a [`Range`](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35) header field (in bytes). A good use case for `206` can be found [in this article](https://benramsey.com/blog/2008/05/206-partial-content-and-range-requests/) – Roman Vottner Feb 19 '16 at 03:24
  • @RomanVottner, you should add an answer to this question that demonstrates partial response queries in a ReSTful manner. Content negotiation makes sense for formatting, which generally provides a very limited number of options; but are you suggesting that every possible combination of fields should be represented by its own format? – jaco0646 Feb 19 '16 at 16:51
  • The main issue we have with something like a fields parameter is that it means you can't round-trip your objects. An object returned by a GET with only partial fields could not be used in a PUT, because some fields are missing. – bpeikes Feb 19 '16 at 20:52
  • @bpeikes, that is not a requirement of ReST, to my knowledge. I don't see any impediment to [hateoas](https://en.wikipedia.org/wiki/HATEOAS) in a partial response. – jaco0646 Feb 19 '16 at 22:39
  • @jaco0646, I understand that it might not be a requirement, but it makes for an API which is much simpler to develop. You don't have to check for every combination of required fields. – bpeikes Feb 22 '16 at 12:40

I like the advice given in this article.

Use a fields query parameter that takes a comma separated list of fields to include. For example, the following request would retrieve just enough information to display a sorted listing of open tickets:

Jose Martinez
  • Is this any different from the answer I gave, and does it address the comment from @RomanVottner? – jaco0646 Feb 19 '16 at 14:56
  • Yes. "Can anyone point to an article or book that discusses issues like this?" That was the question I was answering. – Jose Martinez Feb 19 '16 at 17:15
  • I was hoping to avoid answering that question directly, because that would necessitate this thread being closed as off-topic. "_Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow..._" – jaco0646 Feb 19 '16 at 18:22

The OData protocol provides a very comprehensive RESTful API. CRUD (Create, Retrieve, Update, Delete) is commonly done with POST, GET, PUT, and DELETE respectively.

A request for a partial resource is made by adding a $select query option:

api/Person?$select=first_name,last_name,age
Yoram