120

I'm adding and removing AWS IAM user policies programmatically, and I'm getting inconsistent results from the application of those policies.

For example, this may or may not succeed (I'm using version 1.6.6 of the AWS SDK for Java):

  1. Start with a user that can read from a particular bucket
  2. Clear user policies (list policies then call "deleteUserPolicy" for each one)
  3. Wait until the user has no user policies (call "listUserPolicies" until it returns an empty set)
  4. Attempt to read from the bucket (this should fail)

If I put in a breakpoint between #3 and #4 and wait a few seconds, the user cannot read from the bucket, which is what I expect. If I remove breakpoints, the user can read from the bucket, which is wrong.

(The behavior is also inconsistent when I add a policy and then access a resource.)

I'd like to know when a policy change has taken effect on the component (S3, SQS, etc.), not just on the IAM system. Is there any way to get a receipt or acknowledgement of this? Or is there a certain amount of time to wait?

Is there any documentation on the internals of policy application?

(FYI I've copied my question from https://forums.aws.amazon.com/thread.jspa?threadID=140383&tstart=0)

petertc
Ed Norris

2 Answers

103

The phrase "almost immediately" is used 5 times in the IAM FAQ, and is, of course, somewhat subjective.

Since AWS is a globally distributed system, your changes have to propagate, and the system as a whole seems to be designed to favor availability and partition tolerance over immediate consistency.

You may not have considered it, but at step 4 in your flow you could plausibly see a sequence of pass, fail, pass, pass, fail, fail, fail, fail... because neither a bucket nor an object in a bucket is actually a single thing in a single place. The mixed consistency model of different actions in S3 is evidence of this: new objects are immediately consistent, while overwrites and deletes are eventually consistent. So the idea of a policy having "had an effect" or not on a bucket or an object isn't entirely meaningful, since the application of the policy is itself almost certainly a distributed event.

To confirm such an application of policies would require AWS to expose the capability of (at least indirectly) interrogating every entity that has a replicated copy of that policy to see whether it had the current version or not... which would be potentially impractical or unwieldy to say the least in a system as massive as S3, which has grown beyond a staggering 2 trillion objects, and serves peak loads in excess of 1.1 million requests per second.
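Because a single passing or failing check may not be representative, one defensive option is to require several consecutive identical observations before treating a change as settled. The following is only a minimal sketch, assuming Java 8 or later; `StableCheck` and `confirmedStable` are hypothetical names, not part of the AWS SDK, and the `check` argument stands in for whatever probe you use (for example, an attempted bucket read that reports whether access was denied). Even a streak of consistent results is not a guarantee of global propagation, only a reduction in the chance of being fooled by one lucky request.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not part of the AWS SDK. */
final class StableCheck {

    /**
     * Runs the check up to maxChecks times, pausing pauseMillis between runs.
     * Returns true as soon as `required` consecutive runs have passed;
     * any failing run resets the streak. Returns false if the streak is
     * never reached or the thread is interrupted.
     */
    static boolean confirmedStable(BooleanSupplier check, int required,
                                   int maxChecks, long pauseMillis) {
        int streak = 0;
        for (int i = 0; i < maxChecks; i++) {
            if (check.getAsBoolean()) {
                streak++;
                if (streak >= required) {
                    return true;
                }
            } else {
                streak = 0; // a single failure means we start counting again
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

The values for `required`, `maxChecks`, and `pauseMillis` are judgment calls; nothing in the AWS documentation quoted here prescribes them.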

Official AWS answers to this forum post provide more information:

"While changes you make to IAM entities are reflected in the IAM APIs immediately, it can take noticeable time for the information to be reflected globally. In most cases, changes you make are reflected in less than a minute. Network conditions may sometimes increase the delay, and some services may cache certain non-credential information which takes time to expire and be replaced."

The accompanying answer to what to do in the meantime was "try again."

"We recommend a retry loop after a slight initial delay, since in most circumstances you'll see your changes reflected quite quickly. If you sleep, your code will be waiting far too long in most cases, and possibly not long enough for the rare exceptions.

We actively monitor the performance of the replication system. But like S3, we guarantee only eventual consistency, not any particular upper bound."
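The retry loop recommended above could be sketched roughly as follows. This is an illustration under stated assumptions, not AWS's implementation: it assumes Java 8 or later, the class and method names (`PropagationRetry`, `awaitCondition`) are hypothetical, and the doubling backoff with a cap is one common choice that the quoted advice does not itself specify. The `check` argument is a placeholder for your own probe, such as attempting the bucket read and reporting whether the result matches what you expect.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not part of the AWS SDK. */
final class PropagationRetry {

    /**
     * Waits initialDelay, then runs the check. After each failed check the
     * wait doubles, capped at maxDelay. Returns true as soon as the check
     * passes, or false if maxAttempts is exhausted or the thread is interrupted.
     */
    static boolean awaitCondition(BooleanSupplier check,
                                  Duration initialDelay,
                                  Duration maxDelay,
                                  int maxAttempts) {
        long delayMs = initialDelay.toMillis();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            if (check.getAsBoolean()) {
                return true;
            }
            delayMs = Math.min(delayMs * 2, maxDelay.toMillis());
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated probe that starts passing on its third call. In real code
        // this would exercise the resource whose policy was changed.
        int[] calls = {0};
        boolean ok = awaitCondition(() -> ++calls[0] >= 3,
                Duration.ofMillis(10), Duration.ofMillis(100), 10);
        System.out.println("propagated=" + ok + " after " + calls[0] + " checks");
    }
}
```

Since AWS states no upper bound on propagation time, `maxAttempts` and `maxDelay` only limit how long your own code is willing to wait; a `false` result means "not observed yet", not "will never happen".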

Michael - sqlbot
  • 14
    I usually see the change in 5 or 10 seconds. It definitely is not instant, but it's not very slow. – Charles Engelke Nov 23 '13 at 15:04
  • 1
    Thanks for the reply - fortunately this is test code and I have the freedom to basically throw away my current approach and do something different. – Ed Norris Nov 25 '13 at 16:32
  • I ran into the same issue. I wrote a script to scaffold a new CodePipeline project and kept running into `InvalidStructureException: CodePipeline is not authorized to perform AssumeRole on role ` until I manually added a 10-second wait between role creation and CodePipeline creation. – Trung Hieu Nguyen May 26 '20 at 16:28
40

I have a far less scientific answer here... but I think it will help some other people feel less insane :). I kept thinking things were not working while they were just taking more time than I expected.

Last night I was adding an inline policy to allow a host to get parameters from the system manager. I thought it wasn't working because many minutes after the change (maybe 5 or so), my CLI commands were still failing. Then, they started working. So, that was a fairly large delay.

Just now, I removed that policy and it took 2-3 minutes (enough to google this and read a couple other pages) before my host lost access.

Generally things are quite snappy for me as well, but if you're pretty sure something should work and it's not, just do yourself a favor and wait 10 minutes. Unfortunately, this makes automation after IAM changes sound harder than I thought!

John Humphreys
  • I created a new IAM key for SES. It works in us-east-1, but in eu-central-1, where I'm also "un-sandboxed", the key is reported as invalid. I think it's been more than an hour since I created it, and it's still not working. I'll sleep on it and see if it works tomorrow. – Leif Neland May 20 '20 at 23:32
  • 1
    Nope, 7 hours later the key still only works in us-east-1. – Leif Neland May 21 '20 at 06:38
  • 6
    In my case, "almost immediately" meant 7 min. Taking time off was a more productive use of my time than constantly refreshing and double-checking :) – Fabien Snauwaert Aug 06 '20 at 17:02
  • 5
    Thanks for confirming my suspicion. I find that roles attached to EC2 instances take significantly longer to update than roles attached to IAM users. – Roger Far Dec 16 '20 at 20:24
  • "Almost immediately" meant 10 minutes for me. – jellycsc Mar 21 '23 at 19:15