I'm making a simple CRUD application with MongoDB so I can learn more about it.

The application is a simple blog. I have a collection named "articles" that stores documents, each one representing a post on my blog.

When I display the list of all blog posts, I can do a db.collection.find(), and list all of them.

The problem arises when I need to show a single post, which means querying the collection for one specific document.

The logical solution would be to use an RDBMS with an auto-increment feature, but MongoDB is NoSQL and does not have auto-increment.

I'm using the auto-generated _id field of the document, which stores an ObjectId by default. This means my URLs look like this:

http://localhost/blog/article.php?_id=5d41f6e5fc1a2f3d80645185

I saw in the documentation that the ObjectId contains a unique identifier for the server, together with a timestamp and a counter. Isn't exposing these things a security risk?

As a solution, I stumbled upon UUID (https://docs.mongodb.com/manual/reference/method/UUID/), an auto-generated unique ID that doesn't expose a timestamp or machine info. It seems logical to use this instead of the _id that contains my ObjectId for querying and finding a document.

That way I can make my URLs look like this:

http://localhost/blog/article.php?_id=23829651-26f7-4092-99d0-5be8658c966e

But still, should I keep the _id property? Should I add another one called "id" that stores the UUID? Should I even use UUIDs at all?

gtbono
  • A different option would be to create slugs from the title: take the first X characters, strip everything but spaces and letters, and replace spaces with dashes. Instead of `http://localhost/blog/article.php?_id=5d41f6e5fc1a2f3d80645185` you could get `http://localhost/blog/article.php?post=how-i-learned-to-draw`; or, if you handled 404s with a custom error page that checks the database to see if the post exists, you could have `http://localhost/blog/how-i-learned-to-draw` – zbee Aug 08 '19 at 19:09
  • Additionally, it's just an identifier for your machine and is not inherently very sensitive data, just unique and really only [theoretically a vulnerability](https://stackoverflow.com/a/4588089/1843510). UUID is more appealing, but long - it certainly exposes zero information about your systems however. – zbee Aug 08 '19 at 19:18
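
The title-based approach from the first comment could be sketched roughly as follows. This is only an illustration: the function name `make_slug`, the default of 40 characters, and the lowercasing step are assumptions and not part of the comment.

```python
import re


def make_slug(title: str, max_chars: int = 40) -> str:
    """Build a URL slug from a post title (hypothetical helper)."""
    # Keep only the first max_chars characters of the title.
    truncated = title[:max_chars]
    # Strip everything except ASCII letters and spaces.
    letters_and_spaces = re.sub(r"[^A-Za-z ]", "", truncated)
    # Lowercase (an assumption), collapse runs of spaces, join with dashes.
    return "-".join(letters_and_spaces.lower().split())


print(make_slug("How I Learned to Draw!"))  # how-i-learned-to-draw
```

Two titles can produce the same slug, so the slug field would probably need a unique index plus some fallback, such as appending a number.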

1 Answer

Here's what I would consider before choosing an identifier:

Collision

Risk of collision is very low for both UUIDs and ObjectIDs. This has been discussed in detail in another question.

Nature

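UUIDs are random, whereas ObjectID values increase over time because they begin with a timestamp. This makes ObjectIDs a bad choice for a range-based shard key, since new documents all land in the same range; a hashed shard key avoids this.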

Other uses

ObjectIDs include the creation timestamp and can be used as a substitute for the commonly used createdAt field. A sort by ObjectID is, to one-second precision, a sort by creation time.
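
As a minimal sketch of where that timestamp lives: the first 4 bytes (8 hex characters) of an ObjectID are the creation time in seconds since the Unix epoch, so it can be read with the standard library alone. The function name `objectid_creation_time` is hypothetical; in practice the driver's own ObjectId type exposes the same information.

```python
from datetime import datetime, timezone


def objectid_creation_time(oid_hex: str) -> datetime:
    """Read the creation time embedded in an ObjectID hex string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectID is 24 hexadecimal characters")
    # The leading 8 hex characters are a big-endian count of seconds
    # since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The ObjectID from the question's URL.
print(objectid_creation_time("5d41f6e5fc1a2f3d80645185").isoformat())
```

This is also exactly what a visitor can recover from a URL that contains the ObjectID: when the document was created.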

Insecure object references (OWASP)

Short definition: the concern is that an attacker who has the ID of one object can deduce the ID of another object. You can read more about this here. Neither UUIDs nor ObjectIDs are vulnerable to this.
Link to another question that discusses the security of ObjectIDs (thanks zbee).

Ease of use

Note: This is subjective.
Using ObjectIDs is a lot easier in the Mongo ecosystem. The special aggregation operators for dealing with ObjectIDs, plus library support, add to this.

Portability

UUIDs are more portable than ObjectIDs. I do not know of any system other than Mongo that uses ObjectIDs internally, whereas other databases such as Postgres have a dedicated data type for UUIDs, plus extensions for random generation and so on.
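
To illustrate the portability point, here is a small sketch that generates and validates random (version 4) UUIDs using only Python's standard `uuid` module, with no Mongo-specific code. The helper names `new_public_id` and `parse_public_id` are hypothetical, and insisting on version 4 is an assumption about how the IDs were created.

```python
import uuid


def new_public_id() -> str:
    """Generate a random (version 4) UUID string for a new post."""
    return str(uuid.uuid4())


def parse_public_id(raw: str) -> uuid.UUID:
    """Validate an ID taken from a URL before using it in a query."""
    parsed = uuid.UUID(raw)  # raises ValueError if the string is malformed
    if parsed.version != 4:
        raise ValueError("expected a version 4 (random) UUID")
    return parsed


# The UUID from the question's example URL.
post_id = parse_public_id("23829651-26f7-4092-99d0-5be8658c966e")
print(post_id, new_public_id())
```

Rejecting malformed values early means a bad URL parameter never reaches the database query.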

Ramit Mittal