I'm building an AWS Lambda function that generates a JWT from a PEM file. The example I'm following comes from this link:

https://docs.github.com/en/apps/creating-github-apps/authenticating-with-a-github-app/generating-a-json-web-token-jwt-for-a-github-app

When I read the PEM from a file, everything works. This is the code that does the job:

# Open PEM
with open(pem, 'rb') as pem_file:
    signing_key = jwt.jwk_from_pem(pem_file.read())

However, I need to get the content of the pem file from an AWS secret. Therefore, this is the code that I'm using:

# get PEM from AWS secret
secret =  get_secret()
signing_key = jwt.jwk_from_pem(secret)

When I run this code, I get the error below:

{
  "errorMessage": "from_buffer() cannot return the address of a unicode object",
  "errorType": "TypeError",
  "stackTrace": [
    "  File \"/var/task/handler.py\", line 19, in handler\n    signing_key = jwt.jwk_from_pem(secret)\n",
    "  File \"/var/task/jwt/jwk.py\", line 405, in jwk_from_pem\n    return jwk_from_bytes(\n",
    "  File \"/var/task/jwt/jwk.py\", line 384, in jwk_from_bytes\n    return jwk_from_private_bytes(\n",
    "  File \"/var/task/jwt/jwk.py\", line 328, in wrapper\n    return func(content, loader, **kwargs)\n",
    "  File \"/var/task/jwt/jwk.py\", line 345, in jwk_from_private_bytes\n    privkey = private_loader(content, password, backend)  # type: ignore[operator]  # noqa: E501\n",
    "  File \"/var/task/cryptography/hazmat/primitives/serialization/base.py\", line 24, in load_pem_private_key\n    return ossl.load_pem_private_key(\n",
    "  File \"/var/task/cryptography/hazmat/backends/openssl/backend.py\", line 949, in load_pem_private_key\n    return self._load_key(\n",
    "  File \"/var/task/cryptography/hazmat/backends/openssl/backend.py\", line 1169, in _load_key\n    mem_bio = self._bytes_to_bio(data)\n",
    "  File \"/var/task/cryptography/hazmat/backends/openssl/backend.py\", line 630, in _bytes_to_bio\n    data_ptr = self._ffi.from_buffer(data)\n"
  ]
}

It seems that when I read the PEM from the file, it has the correct encoding. However, when I get a string with the same value from the AWS secret, the library rejects it.

Any suggestions?

EDIT: Here's the get_secret function:

import json

import boto3
from botocore.exceptions import ClientError


def get_secret():
    secret_name = "pem"
    region_name = "eu-west-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )

    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError:
        # For a list of exceptions thrown, see
        # https://docs.aws.amazon.com/secretsmanager/latest/apireference/API_GetSecretValue.html
        raise

    # Decrypts secret using the associated KMS key.
    # SecretString holds JSON; the value under 'pem' is returned as a str.
    secret = json.loads(get_secret_value_response['SecretString'])
    return secret['pem']
Andres
  • Where is `get_secret` defined? What is returned from it? – Karl Knechtel Mar 16 '23 at 00:11
  • This doesn't appear to be a question about encoding at all - so far. Hint: where the code says `with open(pem, 'rb') as pem_file:`, what do you think is the effect of the `b` in `'rb'`? What *type* do you expect from the `pem_file.read()` call? – Karl Knechtel Mar 16 '23 at 00:13
  • @KarlKnechtel it's a function that reads an AWS secret and returns a String. – Andres Mar 16 '23 at 00:13
  • Strings do not have an associated encoding. Python strings are fundamentally sequences of Unicode code points. – shadowtalker Mar 16 '23 at 00:13
  • Does `get_secret` return a `str`? If so, you need to convert it into a `bytes` using `.encode`. This function evidently expects a `bytes` object and not a `str`. You _will_ need to choose an encoding to perform this conversion, so that depends on what `jwk_from_pem` expects. – shadowtalker Mar 16 '23 at 00:13
  • @shadowtalker If you write an answer telling me how to do it and it works, I can give you the points – Andres Mar 16 '23 at 00:15
  • @Andres did you write the `get_secret` function? I can't answer until I know what it returns. – shadowtalker Mar 16 '23 at 00:16
  • If the question is "how do I encode the `get_secret` result to bytes?", the canonical duplicate is https://stackoverflow.com/questions/7585435. If you generally are confused about the distinction between strings and `bytes` objects, see https://stackoverflow.com/questions/6224052/. If the question is "what encoding should I use?", that depends: **if the `pem_file` contents actually represent text**, then use the encoding that that file uses. You need to figure out that encoding; see https://stackoverflow.com/questions/436220. Otherwise, *we need more information*. – Karl Knechtel Mar 16 '23 at 00:17
  • @shadowtalker I added the function to the question – Andres Mar 16 '23 at 00:18
  • Aha. So the next issue is: *what does the documentation say*, about the JSON result that comes from `client.get_secret_value`? For example, does it say that the `SecretString` key contains a string with a particular encoding? Maybe it says that it represents binary data, using an encoding such as Base64, or as a hex dump? Something else? It's not possible for people to just tell you how to "decode" or otherwise interpret arbitrary data; the correct way depends on how the data is formatted/structured, which should be documented somewhere. – Karl Knechtel Mar 16 '23 at 00:20
  • @KarlKnechtel I guess if you want to answer the question you can look at the boto3 documentation – Andres Mar 16 '23 at 00:31
  • The documentation tells me that the corresponding string is "The decrypted secret value, if the secret value was originally provided as a string or through the Secrets Manager console. If this secret was created by using the console, then Secrets Manager stores the information as a JSON structure of key/value pairs." - well; *that's the problem*; if you are storing a secret that you need to use as a binary value later, you should *store a binary value in the first place*. After all, I see no guarantee here that the PEM data you need to store, *sensibly represents text in the first place*. – Karl Knechtel Mar 16 '23 at 00:39

1 Answer

A PEM file contains plain text that represents binary data using Base64 encoding. It is, therefore, effectively ASCII (or any ASCII-transparent encoding): apart from the -----BEGIN ...----- and -----END ...----- marker lines, it contains only the characters used by Base64, all of which are in the ASCII range and occupy one byte each in the underlying file. The jwt.jwk_from_pem interface expects a bytes object, which is what the from_buffer() TypeError in the traceback is complaining about; per the example in the linked documentation, this should be the raw contents of the PEM file. The data is still Base64-encoded at that point: the library takes care of decoding it, even though it wants binary rather than text input.
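To make the type difference concrete, here is a minimal sketch that writes PEM-shaped text to a temporary file, reads it back in binary mode, and compares the result with the same text encoded as ASCII. The PEM_TEXT value is a made-up placeholder, not a real key, and the variable names are only for illustration.

```python
import os
import tempfile

# Placeholder PEM-shaped text (hypothetical; not a real key).
PEM_TEXT = (
    "-----BEGIN RSA PRIVATE KEY-----\n"
    "TUlJRXBBSUJBQUtDQVFFQQ==\n"
    "-----END RSA PRIVATE KEY-----\n"
)

# Write the text to a temporary file, the way a PEM file on disk would hold it.
with tempfile.NamedTemporaryFile(
    "w", encoding="ascii", newline="\n", suffix=".pem", delete=False
) as f:
    f.write(PEM_TEXT)
    pem_path = f.name

try:
    # Binary mode ('rb') yields bytes, which is what worked in the question.
    with open(pem_path, "rb") as pem_file:
        from_file = pem_file.read()
finally:
    os.remove(pem_path)

# A str obtained elsewhere becomes the same bytes once encoded as ASCII.
from_text = PEM_TEXT.encode("ascii")

print(type(from_file).__name__, type(PEM_TEXT).__name__)
print(from_file == from_text)
```

The file read gives bytes while the in-memory value is a str, and encoding the str reproduces the file contents byte for byte.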

The stored secret, therefore, should be in a compatible format.

Since Base64 uses only ASCII characters, it works fine to store the PEM as text in the secret store and read the ['SecretString'] from the response. The documentation describes this value as "The decrypted secret value, if the secret value was originally provided as a string or through the Secrets Manager console." So, by providing a string that contains a Base64 representation of binary data (i.e., one that consists of uppercase letters, lowercase letters, digits, +, / and =, the last used for padding), the secret store holds a usable value. To use it, we simply encode that string to bytes using any ASCII-compatible encoding, such as 'ascii' itself, and proceed as before.
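A minimal sketch of that conversion, assuming the SecretString is a JSON document with the PEM text under a 'pem' key, as in the question's get_secret. The helper name pem_bytes_from_secret_string and the sample value are hypothetical.

```python
import json


def pem_bytes_from_secret_string(secret_string: str, key: str = "pem") -> bytes:
    """Extract the PEM text from a JSON SecretString and return it as bytes."""
    secret = json.loads(secret_string)
    # PEM content is pure ASCII, so 'ascii' raises on any unexpected character
    # instead of silently producing different bytes.
    return secret[key].encode("ascii")


# Stand-in for get_secret_value_response['SecretString'] (placeholder, not a real key).
example_secret_string = json.dumps(
    {"pem": "-----BEGIN RSA PRIVATE KEY-----\nTUlJ\n-----END RSA PRIVATE KEY-----\n"}
)

pem_bytes = pem_bytes_from_secret_string(example_secret_string)
# In the question's handler, the bytes would then be passed on:
# signing_key = jwt.jwk_from_pem(pem_bytes)
```

With the existing get_secret left unchanged, the equivalent one-line change in the handler would be jwt.jwk_from_pem(secret.encode('ascii')).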

Alternatively, the secret store can store binary data, and the client can check the ['SecretBinary'] key of the JSON response. However, in this case, "The response parameter represents the binary data as a base64-encoded string." In other words, the binary data (which happens to consist of ASCII representations of Base64 data) is Base64-encoded again in the result JSON. The client code would need to decode that.
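A hedged sketch of handling the binary case. Whether the value arrives as Base64 text or as already-decoded bytes may depend on the client library in use, so this hypothetical helper (pem_bytes_from_binary is not part of any library) accepts both; verify against the actual response before relying on it.

```python
import base64
from typing import Union


def pem_bytes_from_binary(secret_binary: Union[bytes, bytearray, str]) -> bytes:
    """Return raw PEM bytes from a SecretBinary-style value."""
    if isinstance(secret_binary, (bytes, bytearray)):
        # Already binary: assume the client library decoded the outer Base64 layer.
        return bytes(secret_binary)
    # Otherwise treat it as the Base64 text described in the quoted documentation.
    return base64.b64decode(secret_binary)


# Placeholder PEM bytes (not a real key) and their Base64-wrapped text form.
raw_pem = b"-----BEGIN RSA PRIVATE KEY-----\nTUlJ\n-----END RSA PRIVATE KEY-----\n"
wrapped = base64.b64encode(raw_pem).decode("ascii")

print(pem_bytes_from_binary(raw_pem) == raw_pem)
print(pem_bytes_from_binary(wrapped) == raw_pem)
```

Either way, the result is the PEM file's own bytes, whose body is still Base64 text that the JWT library decodes itself.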

Karl Knechtel