I have an AWS Lambda deployed successfully with Terraform:

resource "aws_lambda_function" "lambda" {
  filename                       = "dist/subscriber-lambda.zip"
  function_name                  = "test_get-code"
  role                           = <my_role>
  handler                        = "main.handler"
  timeout                        = 14
  reserved_concurrent_executions = 50
  memory_size                    = 128
  runtime                        = "python3.6"
  tags                           = <my map of tags>
  source_code_hash               = "${base64sha256(file("../modules/lambda/lambda-code/main.py"))}"
  kms_key_arn                    = <my_kms_arn>
  vpc_config {
    subnet_ids         = <my_list_of_private_subnets>
    security_group_ids = <my_list_of_security_groups>
  }
  environment {
    variables = {
      environment = "dev"
    }
  }
}

Now, when I run the terraform plan command, it says my Lambda resource needs to be updated because the source_code_hash has changed, but I didn't update the Lambda Python codebase (which is versioned in a folder of the same repo):

  ~ module.app.module.lambda.aws_lambda_function.lambda
  last_modified:                     "2018-10-05T07:10:35.323+0000" => <computed>
  source_code_hash:                  "jd6U44lfe4124vR0VtyGiz45HFzDHCH7+yTBjvr400s=" => "JJIv/AQoPvpGIg01Ze/YRsteErqR0S6JsqKDNShz1w78"

I suppose this is because Terraform compresses my Python sources each time and the resulting archive changes. How can I avoid that when there are no changes in the Python code? Is my hypothesis even coherent given that I didn't change the Python codebase (that is, why does the hash change at all)?

Arcones

5 Answers

This is because you are hashing just main.py but uploading dist/subscriber-lambda.zip. Terraform compares the hash to the hash it calculates when the file is uploaded to lambda. Since the hashing is done on two different files, you end up with different hashes. Try running the hash on the exact same file that is being uploaded.
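For example, a minimal sketch of the relevant arguments, hashing the same zip that is uploaded (filebase64sha256 needs Terraform 0.11.12+; on older versions the equivalent is "${base64sha256(file("dist/subscriber-lambda.zip"))}", which may not handle binary files well):

resource "aws_lambda_function" "lambda" {
  filename         = "dist/subscriber-lambda.zip"

  # Hash the exact artifact that gets uploaded, not the .py file inside it
  source_code_hash = filebase64sha256("dist/subscriber-lambda.zip")

  # ... function_name, role, handler, runtime, etc. unchanged from the question ...
}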

Kon
  • Super! I resolved this by replacing the hash of main.py with the hash of the whole dist/subscriber-lambda.zip. Moreover, as I created that zip via the Terraform archive_file data source, I was able to use its output: source_code_hash = "${data.archive_file.lambda.output_base64sha256}" – Arcones Oct 11 '18 at 07:31
  • @Kon I am confused: how do you create a hash of the file that will be stored in S3? I am using an `aws_s3_bucket_object` resource to upload a jar file to S3, and in the Lambda resource I am using the S3 key to specify the file location. – Manoj Acharya Dec 09 '19 at 12:16
  • Just make sure you're hashing the exact file you're uploading. In your case it would be the jar file. – Kon Dec 09 '19 at 23:45
  • @Kon yeah, that worked! One thing to note in the case of jar files: you need the `filebase64sha256()` function to create the hash value. – Manoj Acharya Dec 20 '19 at 15:20
  • Any ideas how to get around this in a CI/CD scenario? For example, I'm running into this because when CI/CD kicks off it builds and zips my code, then runs terraform apply. The source didn't actually change, but the hash of the zip has changed. I was hoping Terraform would know it didn't need to redeploy in this scenario. – Tyson Steele Decker Apr 23 '20 at 14:54
  • @TysonSteeleDecker: Possibly a little late, but here's how I got around this: I removed source_code_hash completely (which basically means the lambda/lambda_layer is never updated on its own). Then I added a script with 'data "external"' which builds the zip for me and calculates a hash (however you want to do this; e.g., for Python Poetry, I used the content_hash from poetry.lock). Now I append this hash to the lambda/lambda_layer zip file name. Hence, only when that hash changes (which makes the zip file name change as well) does Terraform trigger an update. – mikey Nov 19 '21 at 14:26

This works for me, and it also doesn't trigger an update on the Lambda function when the code hasn't changed:

data "archive_file" "lambda_zip" {                                                                                                                                                                                   
  type        = "zip"                                                                                                                                                                                                
  source_dir  = "../dist/go"                                                                                                                                                                                         
  output_path = "../dist/lambda_package.zip"                                                                                                                                                                         
}                                                                                                                                                                                                                    


resource "aws_lambda_function" "aggregator_func" {                                                                                                                                                                   
  description      = "MyFunction"                                                                                                                                                                       
  function_name    = "my-func-${local.env}"                                                                                                                                                                  
  filename         = data.archive_file.lambda_zip.output_path                                                                                                                                                        
  runtime          = "go1.x"                                                                                                                                                                                         
  handler          = "main"                                                                                                                                                                                    
  source_code_hash = data.archive_file.lambda_zip.output_base64sha256                                                                                                                                                
  role             = aws_iam_role.function_role.arn                                                                                                                                                                  


  timeout = 120                                                                                                                                                                                                      
  publish = true                                                                                                                                                                                                     

  tags = {                                                                                                                                                                                                           
    environment = local.env                                                                                                                                                                                                                                                                                                                                                                    
  }                                                                                                                                                                                                                  
}                              

I'm going to add my answer to contrast with the one @ODYN-Kon provided.

The source_code_hash field in resource "aws_lambda_function" is not compared to a hash of the zip you upload. Instead, it is merely checked against the value Terraform saved in state the last time it ran. So the next time you run Terraform, it computes the hash of the actual Python file to see whether it has changed. If it has, Terraform assumes the zip has changed and that the Lambda function resource needs to be updated. The source_code_hash can have any value you want to give it, or it can be omitted entirely. You could set it to some arbitrary constant string, and then it would never change unless you edit your Terraform configuration.

Now, the problem there is that Terraform assumes you updated the zip file. Assuming you only have one directory or one file in the zip archive, you can use the Terraform data source archive_file to create the zip file. I have a case where I cannot use that because I need a directory and a file (JS world: source + node_modules/). But here is how you can use that:

data "archive_file" "lambdaCode" {
  type = "zip"
  source_file = "lambda_process_firewall_updates.js"
  output_path = "${var.lambda_zip}"
}

Alternatively, you can archive an entire directory if you replace the source_file argument with source_dir = "node_modules".
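For instance, a sketch of that variant, reusing the paths from the example above:

data "archive_file" "lambdaCode" {
  type        = "zip"
  source_dir  = "node_modules"    # zip everything under this directory
  output_path = "${var.lambda_zip}"
}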

Once you do this, you can reference the hash of the zip archive inside the resource "aws_lambda_function" "lambda" { ... } block as "${data.archive_file.lambdaCode.output_base64sha256}" for the source_code_hash field. Then, any time the zip changes, the Lambda function gets updated, and the archive_file data source knows that any time the source_file changes it must regenerate the zip.
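A sketch of that wiring, reusing the archive_file above; the function name, role, handler, and runtime below are placeholders chosen for illustration, not taken from the question:

resource "aws_lambda_function" "lambda" {
  filename         = "${data.archive_file.lambdaCode.output_path}"
  function_name    = "process-firewall-updates"          # placeholder
  role             = "${aws_iam_role.lambda_role.arn}"   # placeholder role
  handler          = "lambda_process_firewall_updates.handler"
  runtime          = "nodejs8.10"                        # placeholder runtime

  # Recomputed whenever archive_file regenerates the zip
  source_code_hash = "${data.archive_file.lambdaCode.output_base64sha256}"
}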

Now, I haven't drilled down to a root cause in your case, but hopefully this gives you some help getting to a better place. You can check Terraform's saved state via tf state list, which lists the items in state. Find the one that matches your Lambda function block and then execute tf state show <state-name>. For example, for one I am working on:

tf state show aws_lambda_function.test-lambda-networking gives about 30 lines of output, including:

source_code_hash = 2fKX9v/duluQF0H6O9+iRnID2gokhfpXIXpxyeVBUM0=

You can compare the hash from the command line, but note that source_code_hash is a base64-encoded SHA-256 digest while sha256sum prints hex, so a comparable value comes from something like openssl dgst -sha256 -binary my-lambda.zip | base64. (On macOS, sha256sum itself can be installed via brew install coreutils.)

As mentioned, archive_file doesn't work when the zip has multiple elements that aren't isolated to a single directory. I think that probably happens a lot, so I wish the HashiCorp folks would extend archive_file to support multiple sources. I even went looking at the Go code, but that is a rainy-day project. One variation I use is to set source_code_hash to "${base64sha256(file("my-lambda.zip"))}", but that still requires me to run tf twice.

Kevin Buchs
  • According to https://stackoverflow.com/questions/52662244/terraform-lambda-source-code-hash-update-with-same-code and my own testing, Terraform stores the hash of the uploaded zip regardless of what you provide as the source_code_hash. – nick fox Jun 06 '19 at 11:48
  • Try to avoid `archive_file`. Ref: https://github.com/hashicorp/terraform/issues/18422 – BentCoder Mar 27 '22 at 15:37

As others have said, the zip should be used both as your filename and as the input to your hash.

I want to mention that you can also get similar recreation issues if you use the wrong hash function in your Lambda definitions. For example, filesha256("file.zip") will also recreate your Lambdas every time. You have to use filebase64sha256("file.zip") (Terraform 0.11.12+) or base64sha256(file("file.zip")), as mentioned under source_code_hash in the aws_lambda_function documentation.
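A sketch of the difference inside a resource block (the zip path and other arguments here are illustrative only):

resource "aws_lambda_function" "example" {
  filename      = "file.zip"
  function_name = "example"                  # placeholder
  role          = aws_iam_role.example.arn   # placeholder
  handler       = "main.handler"
  runtime       = "python3.6"

  # Wrong: filesha256() returns a hex digest, which never matches the
  # base64-encoded hash AWS reports back, so the function re-deploys every run:
  # source_code_hash = filesha256("file.zip")

  # Right: base64-encoded SHA-256 of the deployment package:
  source_code_hash = filebase64sha256("file.zip")
}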

adavea

Just chiming in a few years late here to say that Mikey's comment above was the answer for me, and sadly I cannot upvote it as I have no reputation. Therefore, I will fully expand upon his approach here, as none of the other solutions worked for me.

My problem was that my Lambda zip was rebuilt every time the CI/CD process ran, which fetched all the pip packages again, so the timestamps inside the zips were out of whack and led to different hashes. As other answers on similar threads have pointed out, there are now a few "deterministic zip" libraries floating around that you could look into.

Now onto Mikey's answer, as this is pure magic. By removing source_code_hash from your Lambda block, Terraform will only redeploy when the name of the file changes. Therefore, with no other changes, Terraform will deploy the first time the function is created and never again. The beauty of this is that we can then effectively put whatever hash we want in the file's name. A simple example would be to calculate the md5 hash of your actual source code file in bash like:

md5sum <modules/lambda/lambda-code/main.py | sed 's/ .*$//' 

Note that the < redirects the file's contents to md5sum's standard input, so the output is just the hash followed by "-" rather than the file path; the sed then strips that trailing part so only the hash remains. If you run this twice with no change to main.py, the md5 will be the same; make any change to main.py and the md5 will change. This is the core idea behind hashing in the name: the versioning/hashing takes place in the filename, which we fully control. You could also calculate a hash of your requirements.txt and append the two together in the filename if you anticipate changing dependencies.

Finally, we need to change some logic in the Terraform to make this work. As I stated previously, we remove the source_code_hash. To pass in the new filename (since the hash is calculated at runtime, we do not know it in advance), you can use several methods; I will demonstrate the environment-variable route. The environment variable needs TF_VAR_ as a prefix.

So in your CI/CD tool, prior to running Terraform, you now need to calculate the hash, for example:

lambda_zip=$(md5sum <modules/lambda/lambda-code/main.py | sed 's/ .*$//').zip  # calculate the new name; the sed strips the trailing " -" left by md5sum
mv lambda_zip.zip "${lambda_zip}"  # rename the built zip to the new hashed name
export TF_VAR_lambda_zip="${lambda_zip}"  # set the TF env var

And in your terraform:

variable "lambda_zip" {
      type = string
      description = "..."
}
resource "aws_lambda_function" "lambda" {
  filename                       = "${var.lambda_zip}"
  function_name                  = "test_get-code"
  role                           = <my_role>
  handler                        = "main.handler"
  timeout                        = 14
  reserved_concurrent_executions = 50
  memory_size                    = 128
  runtime                        = "python3.6"
  tags                           = <my map of tags>
  kms_key_arn                    = <my_kms_arn>
  vpc_config {
    subnet_ids         = <my_list_of_private_subnets>
    security_group_ids = <my_list_of_security_groups>
  }
  environment {
    variables = {
      environment = "dev"
    }
  }
}

And that's pretty much it. Now, if you don't make any changes to your code, the md5 hash will come out the same on every run and Terraform will not redeploy. But if you make edits to your code, the md5 hash changes, so the filename changes, and Terraform will redeploy your zip.