Our application implements APIs to access entity details via AJAX GET
requests - it once used POST
but we decided to change our standards.
Unfortunately we hit a design flaw recently.
Suppose our API is http://localhost/app/module/entity/detail/{id}
where {id}
is a Spring @PathVariable
that matches that entity's primary key.
If the primary key is a numeric surrogate key (auto-increment) there is no problem then. But we thought that this all worked with String
primary keys.
It happened that we found some valid production data contain slashes, backslashes and semicolons for primary key. Our tables cannot use auto-increment surrogate keys because of their gargantuar size.
More in general, we discovered that we were unprepared to handle non-alphanumeric characters.
What the defective code was originally
Here is how our application used to reach to display entity data in a single page form the entity list:
- User navigates
table.jsp
where an AJAX list is retrieved via POST
An Angular expression constructs the link to the detail page of that entity by means of detail?entityId={{::row:idAttribute}}
- A valid link is generated by Angular
Examples:
detail?entityId=5903475934 //numeric case, we have several details page with same navigation pattern
detail?entityID=AAABBBCCC
detail?entityID=DO0000099101\test
The user clicks on the link and the browser points to the corresponding address
http://localhost/blablabla/detail?entityId=DO0000099101\test //Could be escaped.... read more later
The page needs the ID code from the parameters to issue the correct AJAX call
Need to retrieve the Entity ID from the query string. The Angular controller is in a separate file that doesn't see the query string (and is included dynamically, having the same page name)
<script type="text/javascript">
var ____entityId = '${param.entityId}';
</script>
Which gets translated to
<script type="text/javascript">
var ____entityId = 'DO0000099101\test'; //Yes, I know it's incorrect because now we have a TAB
</script>
- Angular fetches the Entity ID into the page scope
In the Angular controller, we do the following
$scope.entityId = ___entityId;
$http.get($const.urlController+'detail/'+$scope.entityId)
Ajax call is issued
http://localhost/blablablabla/entity/detail/DO0000099101est //URL is bad
Spring MVC decodes the
@PathVariable
Unfortunately, the parameter is treated as DO0000099101est
What we tried to fix
We have tried to fix the mistake by escaping the \t
as it is clear that its presence in the HTML content constitutes a bug. Javascript interprets it as a tab.
- Trying to URL-escape the ID
We tried to manually navigate to http://localhost/blablabla/detail?entityId=DO0000099101%5Ctest
by escaping the backslash to its URL entity. The purpose is that if this worked we could modify Angular code in the table page
The result is that the \t
appeared again as it is in the Javascript fragment. From that point, the sequence is the same
<script type="text/javascript">
var ____entityId = 'DO0000099101\test';
</script>
- Trying to Java-escape + URL-Escape the ID in the URL
What if the Angular table page escaped the URL by this way?
http://localhost/blablabla/detail?entityId=DO0000099101%5C%5Ctest
<script type="text/javascript">
var ____entityId = 'DO0000099101\\test'; //Looks great
</script>
Unfortunately Angular now reverses the backslash into a forward slash when performing the Ajax request
http://localhost/blablabla/detail/DO0000099101/test
- Trying to perform the Ajax call manually with escaped URL
So know that the REST url is http://localhost/app/module/entity/detail/{id}
, let's try to see how Spring expects the backslash to be escaped
http://localhost/app/module/entity/detail/DO0000099101%5Ctest //results in 400 error
http://localhost/app/module/entity/detail/DO0000099101\\test //Chrome reversed the backslash into a forward slash and the result is 404 as expected for http://localhost/app/module/entity/detail/DO0000099101//test
- Using encodeURIComponent
We already tried this (but I didn't mention in the original post). To @Thomas comment, we tried again to encode the primary key by Javascript escaping in the controller itself.
$http.get($const.urlController+'detail/'+encodeURIComponent($scope.entityId))
The problem with this approach is that the backslash-T
sequence in the URL is encoded to %09
, which results in a tab
decoded on the server side
- Using both encodeURIComponent and a Java escape
Now we tried to use both approach #4 and fixing the hardcoded Java-escaped (i.e. \\t
) into the Javascript for testing purposes.
<script type="text/javascript">
var ____entityId = 'DO0000099101\\test';
</script>
$http.get($const.urlController+'detail/'+encodeURIComponent($scope.entityId))
This time the Ajax call resulted in 400 error as case #3
http://localhost/app/module/entity/detail/DO0000099101%5Ctest //results in 400 error
Question time
I want to ask this question in two ways, one specific and one more general.
- Specific: how can I properly format a
@PathVariable
so that it can preserve special characters? - General: in the REST world, what care is to be done when dealing with resource IDs that may contain special characters?
For the second, let me be more clear. If you control your REST application and generation of IDs, you can choose to allow only alphanumeric identifier (user-generated or random or whatever) and sleep happy.
But in our case we need to browse entities that come supplied from external systems that have broader restrictions on character sets allowed for entity primary identifiers. And again, we cannot use auto-increment surrogate keys.