How do I properly support partial creation of a record in a RESTful service design?

Question

I am designing a RESTful service to support an existing wizard that allows a user to submit a new resource (let's call it customer), but in pieces.

This wizard does validation for each page when the user submits it, but only for the page the user is submitting. It does a full validation pass on the entire object only when the user opts to submit the customer for final processing.

To simplify the wizard, and to allow us to shuffle the UI around in maintenance releases when we add more fields, we have not codified the wizard's structure into the resource. A customer doesn't "roll up" the way the wizard presents the data.
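To make the separation concrete, here is a minimal Python sketch of what I mean. The page-to-field mapping, the required-field rule, and all function names are made up for illustration; the point is only that the grouping lives in the wizard layer and not in the customer schema.

```python
# Hypothetical page-to-field mapping; it belongs to the wizard, not to the customer resource.
PAGE_FIELDS = {
    "contactinformation": ["firstName", "lastName", "emailAddress"],
    "foodpreference": ["meatPreference", "allergies"],
    "fears": ["fears"],
}

# Assumed rule for the example: these fields must be non-empty.
REQUIRED_FIELDS = {"firstName", "lastName", "emailAddress"}


def validate_fields(data, fields):
    """Return a list of error strings, checking only the given field names."""
    errors = []
    for name in fields:
        if name in REQUIRED_FIELDS and not data.get(name):
            errors.append(f"{name} is required")
    return errors


def validate_page(page, data):
    """Validate only the fields owned by the wizard page being submitted."""
    return validate_fields(data, PAGE_FIELDS[page])


def validate_customer(data):
    """Full validation pass over every field, used on final submission."""
    all_fields = [name for fields in PAGE_FIELDS.values() for name in fields]
    return validate_fields(data, all_fields)
```

Shuffling the UI then only means editing PAGE_FIELDS; the full-object validation is unaffected.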

Is it strange to design a RESTful service in a way that named sub-documents for a resource don't necessarily hierarchically show up in the full document for that resource (or at least not in the same way)?

Say my wizard pages were:

Contact information
Food preferences
List of fears

Then here's an example customer object:

// Note that the wizard page groupings don't show up explicitly
{
    customer: {
        firstName: "Pilsner",
        lastName: "Dopplebock",
        emailAddress: "nextguest@hotelcalifornia.com",
        addressLine1: "123 Fleece Place",
        addressLine2: "",
        town: "Ibinjad",
        region: "North Dakota",
        postalCode: "12123",
        homePhoneNumber: "2123234124",
        faxPhoneNumber: null,
        meatPreference: "well-done",
        allergies: "shellfish",
        fears: [
            "banshees",
            "baths",
            "sleeveless shirts"
        ]
    }
}

Say my base URLs for the resource are:

http://www.somewhere.com/customers
http://www.somewhere.com/customers/{id}

Would it be strange or wrong to create the following RESTful URLs/methods, even though the customer isn't actually sub-divided the way they imply?

http://www.somewhere.com/customers/contactinformation (POST)
http://www.somewhere.com/customers/{id}/contactinformation (POST, or PUT for update? maybe GET)
http://www.somewhere.com/customers/{id}/foodpreference (POST, or PUT for update?, maybe GET)
http://www.somewhere.com/customers/{id}/fears (POST to add a single item?, maybe PUT for a batch?, maybe GET)

I had considered using an alternate wizard URL if I don't have the whole resource at one time, but in my opinion this doesn't seem properly resource-oriented:

http://www.somewhere.com/customerwizard/submitcontactinformation (POST)
http://www.somewhere.com/customerwizard/{customer-id}/submitcontactinformation
http://www.somewhere.com/customerwizard/{customer-id}/submitfoodpreference
http://www.somewhere.com/customerwizard/{customer-id}/fears

(possibly a second question, though related): Is it strange to have a count sub-property for a collection-style resource that doesn't necessarily show up on the main collection? I'd like to do this in support of paginated views...

http://www.somewhere.com/customers/count (GET)

Possibly related - http://stackoverflow.com/questions/7846900/rest-api-having-same-object-but-light?rq=1 - But it seems strange to me to pass query params to select the *type* of `POST` action I'm going to do, and thus the sub-document schema and validation that is performed... — Merlyn Morgan-Graham, Jan 23 '13 at 07:47
Found more related links - might almost be considered DUPEs. Will possibly close this as a dupe tomorrow if I feel well enough convinced: http://stackoverflow.com/questions/232041/how-to-submit-restful-partial-updates and http://stackoverflow.com/questions/2443324/best-practice-for-partial-updates-in-a-restful-service — Merlyn Morgan-Graham, Jan 23 '13 at 09:23

score 0 · Answer 1 · edited Mar 21 '21 at 00:50

I don't think this question is a duplicate of those cited. You are not asking to perform partial updates of an extant resource, but asking the server to validate "partial" resources (which I would argue are whole resources in themselves, just not ones you would normally present to the user later).

REST is not necessarily the correct option to choose in such a case. REST is designed to optimise read access of static or semi-static resources accessed multiple times by a potentially distributed audience. To help, could you answer these questions:

Would you ever want to GET the results of a validation after the initial request-response communication is complete?
Would you ever submit the same data twice (from different users or from the same user)?
If so, would the response always be the same, i.e. is the validation algorithm deterministic?
Is it okay to send all of these data unencrypted and in plain view to others?

If you answered yes to all of those, then REST is a good fit. If you answered no to all of them, REST is a bad fit. Mixed answers muddy the decision somewhat.

Were I in your position, and without further data, I would implement it as an RPC interface over HTTPS to begin with, until I got an idea of what the data caching and security requirements were, so I would know which parts of the system would benefit from caching (either only on the end user's machine or as public resources transmitted unencrypted and cacheable by intermediaries).

There is an awesome resource called Classification of HTTP-based APIs which might help in deciding if REST is actually the path you want to follow in your API design. Bear in mind that there is nothing "wrong" with choosing an alternative design, it is a trade-off. Make an informed decision based on the benefits and drawbacks of each on their own merits.

Thanks for that suggestion. Last night I found many similar articles comparing "kinda-REST" to HATEOAS compliant REST, and it's nice to see partial compliance codified more concretely. — Merlyn Morgan-Graham, Jan 23 '13 at 17:44
Last archived version of the linked site: https://web.archive.org/web/20150311092658/http://nordsc.com/ext/classification_of_http_based_apis.html — Merlyn Morgan-Graham, Mar 21 '21 at 00:49

score 0 · Answer 2 · answered Jan 23 '13 at 15:40

URLs like /customers/{id}/contactinformation and so on are not strange. The one question you might want to ask yourself is whether the fact that it makes sense to break the Customer entity into separate fragments on write doesn't also mean they might be better served separately on read. It would certainly make any HTTP caching more sensible. For example, if you PUT to entity fragments and then GET the parent, a subsequent PUT to the fragment only invalidates the fragment, and the parent may then serve stale data. It's more straightforward to GET a smaller parent entity (which has links to each entity fragment) and then GET each fragment, in which case a PUT to the fragment properly prompts subsequent GETs to retrieve a fresh copy.
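As a rough illustration of the "smaller parent with links" idea, here is a Python sketch. The function name customer_summary and the shape of the "links" object are assumptions for the example, not an established format; the fragment paths are the ones proposed in the question.

```python
def customer_summary(customer_id, base="http://www.somewhere.com"):
    """Build a small parent representation that links to fragments instead of embedding them."""
    root = f"{base}/customers/{customer_id}"
    return {
        "id": customer_id,
        # Each fragment is its own cacheable resource; a PUT to one fragment
        # does not leave a stale copy of it inside the parent.
        "links": {
            "self": root,
            "contactinformation": f"{root}/contactinformation",
            "foodpreference": f"{root}/foodpreference",
            "fears": f"{root}/fears",
        },
    }
```

Because the parent carries no fragment data, it only changes when the set of links changes.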

score 0 · Answer 3 · edited Oct 07 '21 at 13:33

In contrast to Nicholas Shanks, I think a wizard-like system is in line with the idea behind REST. Such an interaction concept is not uncommon on the Web. It is typically used on checkout pages, where you enter customer information on one page, the delivery address on the next, and your payment data on a third, followed by a confirmation page that triggers the actual order once confirmed.

Jim Webber pointed out that in a REST architecture you primarily implement a domain application protocol (a state machine a client will traverse, if you will) that clients follow as they receive all the information they need from the server, either through links or through form-like representations similar to HTML forms. This concept is summarized as HATEOAS.

So, the above-mentioned checkout system in a REST architecture might look like this: after you put an item into your basket, the server offers you additional links that are annotated with link relations (pseudo-HAL representation):

{
    ...,
    "_links": {
        "self": {
            "href": "https://..."
        },
        "create-form": {
            "href": "https://shop.acme.com/checkout-wizard-p1"
        },
        "https://acme.com/rel/checkout": {
            "href": "https://shop.acme.com/checkout-wizard-p1"
        },
        ...
    }
}

The URI itself isn't the relevant thing here, as the spelling of a URI is of no importance in a REST architecture as long as it conforms to the rules outlined in RFC 3986 (URI). It could just as well end in a UUID instead of the checkout-wizard-p1 name used here for simplicity. Instead, the focus is on the link-relation names: create-form, which is a standardized link relation registered at IANA, and the custom link relation https://acme.com/rel/checkout, which follows the extension mechanism outlined by Web Linking (RFC 5988). Introducing this indirection allows servers to replace the actual target URI with another one, and clients will still be able to progress through their task.

Unfortunately, unlike the Link header as defined by RFC 5988, which can define multiple relation names on the same URI (see the examples; Link: <.../checkout-wizard-p1>; rel="create-form https://acme.com/rel/checkout"), HAL JSON only allows one link relation name to be defined per URI, AFAIK.
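For illustration, a client could split such a header into individual relation names roughly like this. This is a simplified Python sketch: the function name parse_link_header is my own, and the regular expression only covers the quoted rel="..." shape shown above, not the full Link header grammar (unquoted values and other parameters are ignored).

```python
import re


def parse_link_header(value):
    """Map each relation name in a simple Link header value to its target URI."""
    links = {}
    # Matches `<uri>; ... rel="name1 name2"`; anything more exotic is skipped.
    for match in re.finditer(r'<([^>]*)>\s*;[^,]*?rel="([^"]*)"', value):
        uri, relations = match.group(1), match.group(2)
        for relation in relations.split():
            links[relation] = uri
    return links


header = '<https://shop.acme.com/checkout-wizard-p1>; rel="create-form https://acme.com/rel/checkout"'
links = parse_link_header(header)
```

Both relation names end up pointing at the same URI, which is exactly what the HAL example has to express with two separate entries.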

To an arbitrary client, a link-relation name is just an arbitrary string. Also, a relation name defined via the Web Linking extension mechanism does not have to point to human-readable documentation to start with, and client support for such relation names may be added later on through updates or plug-ins. The point here is that a client that understands a certain link relation name can act accordingly and will use the URI just to send requests to it. Link relations might be used in a rule engine that triggers the accompanying URIs when certain criteria are met, e.g. the availability of a certain client state and the presence of link relations pointing to the same URI, as in this example with the given link-relation names.
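A minimal Python sketch of a client that keys its behaviour on relation names rather than on URIs might look like this. The function name first_known_link and the preference-list idea are assumptions for the example; the document shape is the pseudo-HAL one from above.

```python
def first_known_link(document, known_relations):
    """Return the href of the first relation the client understands, or None."""
    links = document.get("_links", {})
    for relation in known_relations:
        link = links.get(relation)
        if link is not None:
            # The URI is opaque to the client; it is only used as a request target.
            return link["href"]
    return None


basket = {
    "_links": {
        "self": {"href": "https://..."},
        "create-form": {"href": "https://shop.acme.com/checkout-wizard-p1"},
    }
}
target = first_known_link(basket, ["https://acme.com/rel/checkout", "create-form"])
```

If the server later swaps the target for a UUID-based URI, this client keeps working because it never inspects the URI itself.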

Upon requesting the content of the URI used for either the create-form or the https://acme.com/rel/checkout relationship the server might return a HAL-FORMS representation that allows a client to enter the desired customer information:

{
  "_links": {
    "self": {
      "href": "https://shop.acme.com/checkout-wizard-p1"
    }
  },
  "_templates": {
    "default": {
      "contentType": "application/x-www-form-urlencoded",
      "key": "default",
      "method": "POST",
      "properties": [
        {
          "name": "firstName",
          "prompt": "First Name",
          "readOnly": false,
          "regex": "^[A-Z][a-z]{1,24}$",
          "required": true,
          "templated": false,
          "value": "",
          "maxLength": 25,
          "minLength": 2,
          "placeholder": "Your first name",
          "type": "text"
        },
        ...
      ],
      "target": "https://shop.acme.com/tmp/ea2b3fb1-c640-40e2-b16a-f7433dee6ba2",
      "title": "Checkout Wizard - Customer Information"
    }
  }
}

Similar to traditional HTML forms, HAL-FORMS teaches clients about the target URI to send the request to, the HTTP operation to use as well as the media-type to marshal the request to before sending the request. In contrast to HTML however HAL-FORMS defaults here to application/json rather than to application/x-www-form-urlencoded. Surprisingly only those two media-types are supported by HAL-FORMS though.
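To sketch how a client might turn such a template into a request, here is a small Python example using only the standard library. The function name build_request is my own, and the sketch assumes the template carries both "method" and "target", as the example above does; it only prepares the request and does not send it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(template, values):
    """Prepare an HTTP request from a HAL-FORMS-style template and user input."""
    # application/json is the default media type when contentType is absent.
    content_type = template.get("contentType", "application/json")
    if content_type == "application/x-www-form-urlencoded":
        body = urlencode(values).encode("utf-8")
    elif content_type == "application/json":
        body = json.dumps(values).encode("utf-8")
    else:
        raise ValueError(f"unsupported content type: {content_type}")
    return Request(
        template["target"],
        data=body,
        method=template["method"],
        headers={"Content-Type": content_type},
    )


template = {
    "contentType": "application/x-www-form-urlencoded",
    "method": "POST",
    "target": "https://shop.acme.com/tmp/ea2b3fb1-c640-40e2-b16a-f7433dee6ba2",
}
request = build_request(template, {"firstName": "Pilsner"})
```

The client hard-codes neither the URI nor the method; both come from the form the server handed out.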

The overall structure a resource supports is also taught via the elements contained within the properties property. Like in HTML forms, the type of the presented input can be defined, which can be one of text, hidden, textarea, search, tel, url, email, password, date, month, week, time, datetime-local, number, range or color. Via an optional options element UI controls such as (multi-select) options, checkboxes and radio-buttons can be represented as well.
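The constraints in a property can also drive validation of a single wizard page, on the client or the server. A minimal Python sketch, assuming only the required, minLength, maxLength and regex attributes from the example above (the function name check_property and the error labels are made up, and Python's re module is assumed to be close enough for a simple pattern like this one):

```python
import re


def check_property(prop, value):
    """Return a list of constraint violations for one form property."""
    if value in (None, ""):
        # An empty value only violates the 'required' constraint.
        return ["required"] if prop.get("required") else []
    errors = []
    if "minLength" in prop and len(value) < prop["minLength"]:
        errors.append("too short")
    if "maxLength" in prop and len(value) > prop["maxLength"]:
        errors.append("too long")
    if "regex" in prop and not re.fullmatch(prop["regex"], value):
        errors.append("pattern mismatch")
    return errors


first_name = {
    "name": "firstName",
    "regex": "^[A-Z][a-z]{1,24}$",
    "required": True,
    "maxLength": 25,
    "minLength": 2,
}
```

Running this over only the properties of the submitted template gives the per-page validation the question asks about, without the page layout leaking into the final resource.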

Once a client has entered its data and sent it to the server, the server can simply respond with the next HAL-FORMS representation, asking for further input such as the shipping address and so on. Here, the interaction designer might lean towards updating the temporary resource (i.e. https://shop.acme.com/tmp/ea2b3fb1-c640-40e2-b16a-f7433dee6ba2 in this scenario) directly by using PUT, which HAL-FORMS supports in contrast to HTML forms, though this would require passing along all of the previous data as well, which makes the wizard more or less ... redundant. The next approach would be to use PATCH as the HTTP operation to perform a partial update on the temporary resource. However, as HAL-FORMS currently only supports application/json and application/x-www-form-urlencoded, there is no way to give the server (implicit) instructions on how to transform the temporary resource into the desired outcome, as you would with application/json-patch+json or application/merge-patch+json.
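For reference, the semantics of application/merge-patch+json (RFC 7396) are small enough to sketch in a few lines of Python. This follows the algorithm given in that RFC; the function name merge_patch and the sample data are mine.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) and return the patched value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in the patch means "remove this member".
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


page_one = {"firstName": "Pilsner", "lastName": "Dopplebock"}
patched = merge_patch(page_one, {"addressLine1": "123 Fleece Place", "lastName": None})
```

This is what a PATCH-based wizard page would rely on: each page sends only its own fields, and the server folds them into the temporary resource.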

I therefore prefer to stick to POST here and create a new temporary resource for each wizard page. Through hidden properties, passing along all of the previously transmitted properties would easily be possible, though instead of passing along all the data it might be enough to provide just the URIs of the temporary resources as hidden properties. The final step could then "merge" the information in those temporary resources into one cohesive state, clean up the temporary resources affected, and present the information to the user for a last confirmation, upon which the actual resource is created.
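A rough server-side sketch of that final merge step, in Python. The in-memory dict standing in for the temporary-resource store, the second temporary URI, and the function name finalize are all assumptions for illustration; a real service would use whatever persistence it already has.

```python
def finalize(store, temp_uris):
    """Merge the temporary page resources into one state and clean them up."""
    merged = {}
    for uri in temp_uris:
        # pop() both reads the page data and removes the temporary resource.
        merged.update(store.pop(uri))
    return merged


# Hypothetical store: temporary URI -> data submitted on that wizard page.
store = {
    "https://shop.acme.com/tmp/ea2b3fb1-c640-40e2-b16a-f7433dee6ba2": {"firstName": "Pilsner"},
    "https://shop.acme.com/tmp/page-2": {"meatPreference": "well-done"},
}
customer = finalize(store, list(store))
```

The merged state is what gets the full validation pass and the last confirmation before the actual resource is created.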

While the actual form resources, which are obtained via GET, are cacheable, the temporary resources aren't, as they are only operated on via unsafe operations, which would invalidate stored representations at (intermediary) caches anyway. Caching form resources is a good idea besides: new properties are not introduced all the time, so caching reduces the overhead of transmitting that representation from the server to the client. If the form is updated, e.g. by uploading a new version via PUT or PATCH, that update automatically invalidates any stored representations for that resource. Clients are then served the new version, with the first request bypassing the cache until the response has been stored there.

Nicholas argued that REST should only be used if all four of his questions can be answered with yes. Some of those points, such as the encryption of documents, concern the transport channel (i.e. HTTP over TLS, also known as HTTPS) or the negotiated media type rather than the interaction concepts proposed by the REST architecture. In my view, REST should be aimed for if your service should last for years to come, support a plethora of different clients and allow for future evolution, like introducing new fields on a resource down the road, without having to fear breaking clients.

Long story short, designing a wizard-like interaction through HATEOAS does make sense, especially when form-like representations are used to teach clients where to send the data, which HTTP method to use and which representation format to send the data in. A form also teaches a client which properties a resource supports. The tricky part is how the data provided through the different pages of the wizard are combined and presented to the client. While PUT and PATCH might be attractive at first, on closer inspection they might not be ideal due to restrictions on the media type or the HTTP operation itself.
