We're developing a REST API for our platform. Let's say we have organisations and projects, and projects belong to organisations.

After reading this answer, I would be inclined to use numerical IDs in the URL, so that some of the URLs would become (say, with a prefix of /api/v1):

/organisations/1234
/organisations/1234/projects/5678

However, we want to use the same URL structure for our front end UI, so that if you type these URLs in the browser, you get the relevant webpage in the response instead of a JSON document, much in the same way that sites like Facebook or GitHub show the names of people and organisations in their URLs.

Using this, we could get something like:

/organisations/dutchpainters
/organisations/dutchpainters/projects/nightwatch

It looks like GitHub actually exposes its API in the same way.

I can come up with the following advantages and disadvantages of using names instead of IDs in URLs:

Advantages:

  • More intuitive URLs for end users
  • 1 to 1 mapping of front end UI and JSON API

Disadvantages:

  • Have to use unique names
  • Have to avoid conflicts with reserved names such as count, so that later on you can still add an API endpoint like /organisations/count and actually get the number of organisations instead of the organisation called count.
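
The reserved-name concern could be handled when a name is first registered. The following is only a minimal sketch: the set of reserved words and the function name `is_allowed_name` are hypothetical and not taken from any existing API.

```python
# Hypothetical list of path segments the API wants to keep for its own endpoints.
RESERVED_NAMES = {"count", "search", "new"}


def is_allowed_name(name: str) -> bool:
    """Return True if `name` may be used as an organisation identifier in a URL."""
    # Compare case-insensitively so "Count" cannot shadow /organisations/count.
    return name.lower() not in RESERVED_NAMES
```

The catch is that the list has to be decided before users start claiming names; a word reserved later may already be taken.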

Especially the latter one seems to become a potential pain in the rear. Still, after reading this answer, I'm almost convinced to use the string identifier, since it doesn't seem to make a difference from a convention point of view.

My questions are:

  • Did I miss important advantages / disadvantages of using strings instead of numerical IDs?
  • Did GitHub develop their string-based approach after their platform matured, or did they know from the start that it would imply some limitations (like the one I mentioned earlier; it seems that they did not implement such functionality)?
Sventies

3 Answers

It's common to use a combination of both:

/organisations/1234/projects/5678/nightwatch

where the last part is ignored by the server and is only there to make the URL more readable.

In your case, with multiple levels of collections, you could experiment with this format:

/organisations/1234/dutchpainters/projects/5678/nightwatch

If somebody writes

/organisations/1234/germanpainters/projects/5678/wanderer

it would still map to the Rembrandt (project 5678), but that should be OK. This leaves room for editing the names without breaking URLs that are already out there. Also, names don't have to be unique if you don't really need that.
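
As a rough illustration of this format, the sketch below extracts only the numeric IDs and discards the readable parts. The pattern and the function name `parse_project_path` are assumptions made for the example, not part of any particular framework.

```python
import re
from typing import Optional, Tuple

# The numeric IDs are required; the readable name after each ID is optional
# and is matched only so that it can be ignored.
PATH_PATTERN = re.compile(
    r"^/organisations/(?P<org_id>\d+)(?:/[^/]+)?"
    r"/projects/(?P<project_id>\d+)(?:/[^/]+)?/?$"
)


def parse_project_path(path: str) -> Optional[Tuple[int, int]]:
    """Return (organisation ID, project ID), or None if the path does not match."""
    match = PATH_PATTERN.match(path)
    if match is None:
        return None
    return int(match["org_id"]), int(match["project_id"])
```

With this pattern, the dutchpainters/nightwatch and germanpainters/wanderer URLs above both resolve to organisation 1234 and project 5678.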

Andreas Zita

Reserved HTTP characters: such as “:”, “/”, “?”, “#”, “[”, “]” and “@” – These characters and others are “reserved” in the HTTP protocol to have “special” meaning in the implementation syntax so that they are distinguishable to other data in the URL. If a variable value within the path contains one or more of these reserved characters then it will break the path and generate a malformed request. You can work around reserved characters in query string parameters by URL encoding them or sometimes by double escaping them, but you cannot in path parameters.

https://www.serviceobjects.com/blog/path-and-query-string-parameter-calls-to-a-restful-web-service
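
If names are open to public creation, one option is to reject the characters listed in the quote before a name is stored. This is a minimal sketch under that assumption; `has_reserved_chars` is a hypothetical helper, and the character set is only the one quoted above, not an exhaustive list.

```python
# The reserved characters named in the quoted passage.
RESERVED_CHARS = set(":/?#[]@")


def has_reserved_chars(name: str) -> bool:
    """Return True if `name` contains a character that would break a URL path."""
    return any(ch in RESERVED_CHARS for ch in name)
```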

Michael Gorman
  • Thanks, I'm aware of the (very commonly used) query parameters and similar, and we will definitely use them at some point, but right now I'm really just looking how to set up my basic path parameters. – Sventies May 31 '17 at 14:31
  • Yes, in query strings you can work around this, but with path parameters you cannot escape them. For your example, the organization name cannot contain these characters or the URL will break. This probably won't be a major concern if you control the data that is in the database. But if the data is open to public creation or editing, you will need to specifically block these characters from the field. – Michael Gorman May 31 '17 at 16:31

Consecutive numerical IDs are no longer recommended, because they make it very easy to guess records in your database, and some people might use that to obtain information they should not have access to.

Numerical IDs are used because they have fixed-length storage in the database, which makes indexing easy. For example, INT takes 4 bytes in MySQL and BIGINT takes 8 bytes, so every number has the same length in memory (100 as an INT takes as much space as 200), which makes it very easy to index and search for records.

If you have a lot of entries in the database, then using a VARCHAR field as an index is a bad idea. You should use a fixed-width field like CHAR(32) and pad the difference with spaces, but then you have to add logic in your program to handle the padding when searching the database.

Another idea would be to use slugs, but here you should take into account that some records might end up with the same slug, depending on what you use to form that slug. https://en.wikipedia.org/wiki/Semantic_URL#Slug
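
One possible way to deal with duplicate slugs is to append a numeric suffix when a slug is already taken. The sketch below assumes the existing slugs are available as an in-memory set; in a real application that lookup would be a database query, and both function names are hypothetical.

```python
import re
from typing import Set


def slugify(title: str) -> str:
    """Lower-case the title and replace runs of other characters with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def unique_slug(title: str, existing: Set[str]) -> str:
    """Return a slug for `title` that is not already in `existing`."""
    base = slugify(title)
    candidate = base
    suffix = 2
    # Try base, base-2, base-3, ... until an unused slug is found.
    while candidate in existing:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate
```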

I would recommend using UUIDs, since they all have the same length and resolve this issue easily.
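
As a small sketch of this suggestion, Python's standard `uuid` module can generate random (version 4) UUIDs. The function name `new_public_id` is made up for the example.

```python
import uuid


def new_public_id() -> str:
    """Return a random UUID in its 32-character hexadecimal form."""
    # uuid4() is random rather than sequential, so neighbouring records
    # cannot be found by incrementing an ID.
    return uuid.uuid4().hex
```

The hexadecimal form is always 32 characters long, so it would fit a CHAR(32) column like the one mentioned above without padding; the canonical form with hyphens, `str(uuid.uuid4())`, is 36 characters.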

Alex Efimov
  • Hmm, I guess that depends on your security implementation right? I think your SO ID is 6676502... – Sventies May 31 '17 at 14:28
  • Yeah sure, it depends, but I have also encountered issues with it in many apps, where incrementing the IDs would return info about other records that were not linked to my account. – Alex Efimov May 31 '17 at 14:31
  • You could use a slug to identify records, again with a strict length, but here you should handle collisions (the same slug being generated for different records). – Alex Efimov May 31 '17 at 14:35