Our system is moving from a monolithic to a microservice architecture. The microservice architecture comes with technical challenges that we need to address and one of them is AuthN/AuthZ.
Our approach is to have an authentication service that authenticates users and issues access/refresh tokens (JWTs). The access token is then passed in the request header through the chain of microservices, so each microservice only has to validate the token to confirm that the user has been successfully authenticated. For the AuthZ part, permission enforcement is done in the microservice itself. My questions are related to AuthZ.
To illustrate, let’s take a specific example: a receptionist wants to register a new member in their loyalty program, for instance from a web application. To support this use case, let’s assume the system has two microservices, the ReceptionService and the MemberService. The ReceptionService offers one REST API to initiate the member registration flow; it requires the user permission “registration”. The MemberService offers one REST API to create a new member resource, which is protected by CRUD permissions. A request flow would be:
- The web application, to which the user has previously logged in, sends a member registration request to the ReceptionService API, including the user access token in the header.
- The ReceptionService validates the user token, checks that the user has been granted the “registration” permission, does whatever business logic it needs to do, and finally sends a member creation request to the MemberService API, including the user access token in the header.
- The MemberService validates the user token, checks that the user has been granted the “member.create” permission, and finally creates the member.
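The flow above can be sketched as follows. This is only a minimal illustration, not our actual code: tokens are represented as already-decoded claim dictionaries (signature validation is omitted), the `permissions` claim is an assumption about the token layout, and the names `require_permission`, `reception_register_member`, `member_create`, and the user `alice` are hypothetical.

```python
class PermissionDenied(Exception):
    """Raised when the token's subject lacks a required permission."""


def require_permission(claims: dict, permission: str) -> None:
    # Assumption: the decoded access token carries a "permissions" list.
    if permission not in claims.get("permissions", []):
        raise PermissionDenied(f"{claims.get('sub')} lacks '{permission}'")


def member_create(claims: dict, member: dict) -> dict:
    # MemberService: enforces its own CRUD permission on the incoming token.
    require_permission(claims, "member.create")
    return {"id": 1, **member}


def reception_register_member(claims: dict, member: dict) -> dict:
    # ReceptionService: checks the top-level permission, then forwards
    # the *same* user token to the MemberService.
    require_permission(claims, "registration")
    return member_create(claims, member)


# A receptionist who was only granted the top-level permission.
receptionist = {"sub": "alice", "permissions": ["registration"]}
try:
    reception_register_member(receptionist, {"name": "Bob"})
except PermissionDenied as exc:
    print(f"registration failed: {exc}")
```

In this sketch a user who holds only “registration” passes the ReceptionService check but is rejected by the MemberService, so the user needs both permissions for the flow to succeed.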
To design a solution for such a case, my team worked on the following assumptions/prerequisites:
- A microservice must always enforce permissions (at least for significant API operations such as creating a member). Hence the CRUD permissions on the MemberService in the example above, even though Product Managers might only ask for the top-level “registration” permission.
- A user who is able to start a use case because they have the “top-level” permission must be able to complete it. That is, they must not get errors because they lack another permission somewhere in the underlying services’ call chain.
- Admin users shall not have to understand the chain of permissions required to perform a use case. In our example, Admins should be able to grant users only the “registration” permission.
To complete the above example, two different permissions (“registration” and “member.create”) must be assigned to the user, which breaks some of our assumptions/prerequisites. To overcome that, one of my colleagues proposed declaring microservices as identities/users in our AuthN system so that they can be assigned the appropriate permissions. The user token initially provided would then be replaced, along the call chain, by the token of each participating service. Coming back to the example, the new request flow would be:
- The web application, to which the user has previously logged in, sends a member registration request to the ReceptionService API, including the user access token in the header.
- The ReceptionService validates the user token, checks that the user has been granted the “registration” permission, does whatever business logic it needs to do, and finally sends a member creation request to the MemberService API, including its own service token in the header (thereby replacing the original user token).
- The MemberService validates the service token, checks that the service has been granted the “member.create” permission, and finally creates the member.
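A rough sketch of this second flow, under the same simplifications (decoded claims instead of real JWTs, a hypothetical `permissions` claim, hypothetical function names). The `RECEPTION_SERVICE_TOKEN` constant and the `created_by` field are made up for illustration; in practice the service would obtain its token from the AuthN system.

```python
class PermissionDenied(Exception):
    """Raised when the token's subject lacks a required permission."""


# Hypothetical pre-defined service identity with fixed permissions.
RECEPTION_SERVICE_TOKEN = {
    "sub": "ReceptionService",
    "type": "service",
    "permissions": ["member.create"],
}


def require_permission(claims: dict, permission: str) -> None:
    if permission not in claims.get("permissions", []):
        raise PermissionDenied(f"{claims.get('sub')} lacks '{permission}'")


def member_create(claims: dict, member: dict) -> dict:
    # MemberService: the only identity it sees is the caller's.
    require_permission(claims, "member.create")
    return {"id": 1, "created_by": claims["sub"], **member}


def reception_register_member(user_claims: dict, member: dict) -> dict:
    # ReceptionService: the user only needs the top-level permission.
    require_permission(user_claims, "registration")
    # The user token is dropped here and replaced by the service token.
    return member_create(RECEPTION_SERVICE_TOKEN, member)


receptionist = {"sub": "alice", "permissions": ["registration"]}
print(reception_register_member(receptionist, {"name": "Bob"}))
# prints {'id': 1, 'created_by': 'ReceptionService', 'name': 'Bob'}
```

The user now completes the use case with a single permission, but the MemberService no longer knows which user initiated the request.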
With this solution, service identities in the AuthN system would be flagged so that they are hidden from an Admin user managing permission assignments. Permission assignments to service identities would be predefined, with no possibility for a user to configure them. While it fulfills our assumptions/prerequisites, I have a few concerns about this approach:
- When dealing with “who did what” (audit), user identities and service identities taken from the tokens would be recorded indistinguishably. In our example, the ReceptionService would audit the actual user who initiated the operation, but the MemberService would audit that the operation was performed by the “ReceptionService”. In reporting scenarios, this means I would need to reconcile the audit trails of both services, somehow using a correlation ID, to determine who actually created the member.
- While I understand the need to create an identity for a system component in scenarios that do not involve an actual user (automated batch jobs, third-party access, etc.), I am not comfortable with replacing the user token with a service token in scenarios where a user actually initiated the use case. Is that a standard design pattern?
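To make the audit concern concrete, here is a small sketch of the reconciliation I would have to do. The record layout (`correlation_id`, `actor`, `action`), the sample values, and the function `resolve_initiators` are all hypothetical; the sketch assumes each service writes its own audit trail and that a correlation ID is propagated with the request.

```python
# Hypothetical audit records written by each service for one request.
reception_audit = [
    {"correlation_id": "c-1", "actor": "alice", "action": "registration"},
]
member_audit = [
    {"correlation_id": "c-1", "actor": "ReceptionService", "action": "member.create"},
]


def resolve_initiators(entry_audit: list, downstream_audit: list) -> list:
    # Map each correlation ID to the user recorded at the entry point...
    initiator_by_cid = {r["correlation_id"]: r["actor"] for r in entry_audit}
    # ...and attach that user to the downstream records, which only
    # know the calling service. Unknown correlation IDs yield None.
    return [
        {**r, "initiated_by": initiator_by_cid.get(r["correlation_id"])}
        for r in downstream_audit
    ]


for row in resolve_initiators(reception_audit, member_audit):
    print(row)
```

Answering “who created the member” then requires a join across two services’ audit data instead of a single lookup in the MemberService.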
Could it be that some of our assumptions/prerequisites are just wrong?
- For instance, is it really a security hole if some microservices do not enforce permissions, given that they are only accessed by other controlled microservices in a safe environment? Assuming the answer is “no, it would not be a security hole”, what if tomorrow I need to make the MemberService API accessible outside of the safe environment as well (for instance, because I make it available to a third party)? I would most likely need to add a permission on it, which would break my registration flow.
- Is it wrong to say that we do want Admin users to know which set of permissions is required for a use case, and that we should rather build the system so that it gracefully handles failures due to a missing permission in the call chain (maybe using Sagas and compensation routines)?
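For that last option, this is roughly what I have in mind: a minimal in-process sketch, not a real saga framework. The `run_saga` helper and the “welcome pack” step are invented for illustration; the point is only that a permission failure deep in the chain triggers compensation of the steps that already succeeded.

```python
class PermissionDenied(Exception):
    """Raised when a step fails because a permission is missing."""


def run_saga(steps) -> bool:
    """Run (action, compensation) pairs; undo completed steps on failure."""
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except PermissionDenied:
        # Compensate in reverse order of completion.
        for compensate in reversed(completed):
            compensate()
        return False
    return True


log = []


def reserve_welcome_pack():  # hypothetical ReceptionService business logic
    log.append("welcome pack reserved")


def release_welcome_pack():
    log.append("welcome pack released")


def create_member():  # MemberService rejects the user token
    raise PermissionDenied("user lacks 'member.create'")


def delete_member():
    log.append("member deleted")


ok = run_saga([
    (reserve_welcome_pack, release_welcome_pack),
    (create_member, delete_member),
])
print(ok, log)
# prints False ['welcome pack reserved', 'welcome pack released']
```

The user still gets an error in this variant, so it trades the second assumption (a started use case must complete) for a consistent system state.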
Any comments or links to resources would be greatly appreciated. Thanks!