I am trying to use AWS Step Functions to trigger operations on many S3 files via Lambda. To do this I am invoking a step function with an input that has a base S3 key for the files and a part name for each file (each parallel iteration would operate on a different S3 file). The input looks something like

    {
      "job-spec": {
        "base_file_name": "some_s3_key-",
        "part_array": [
          "part-0000.tsv",
          "part-0001.tsv",
          "part-0002.tsv", ...
        ]
      }
    }

My step function is very simple: it takes that input and maps over it. However, I can't seem to get both the base file name and an array entry as input to my Lambda. Here is my step function definition

    {
      "Comment": "An example of the Amazon States Language using a map state to process elements of an array with a max concurrency of 2.",
      "StartAt": "Map",
      "States": {
        "Map": {
          "Type": "Map",
          "ItemsPath": "$.job-spec",
          "ResultPath": "$.part_array",
          "MaxConcurrency": 2,
          "Next": "Final State",
          "Iterator": {
            "StartAt": "My Stage",
            "States": {
              "My Stage": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                  "FunctionName": "arn:aws:lambda:us-east-1:<>:function:some-lambda:$LATEST",
                  "Payload": {
                    "Input.$": "$.part_array"
                  }
                },
                "End": true
              }
            }
          }
        },
        "Final State": {
          "Type": "Pass",
          "End": true
        }
      }
    }

As written above, it complains that job-spec is not an array for the ItemsPath. If I change that to $.job-spec.part_array, I get the array entries I'm looking for in my Lambda, but the base key is missing.

Essentially I want each Python Lambda to get the base file key and one entry from the array so it can stitch together the complete file name. I can't just put the complete file names in the array because of the limit on how much data I can pass around in Step Functions, and that also seems like a waste of data.

It looks like the Parameters field can be used for this, but I can't quite get the syntax right.

sedavidw
  • Looks like there is now a "new" Map state mode that overcomes the limitation described in this question. It is the Distributed processing mode of the Step Functions Map state: https://docs.aws.amazon.com/step-functions/latest/dg/concepts-inline-vs-distributed-map.html – Comencau Feb 10 '23 at 09:54

1 Answer

I was finally able to get the syntax right. In the Map state, point ItemsPath at the array and add a Parameters block:

    "ItemsPath": "$.job-spec.part_array",
    "Parameters": {
      "part_name.$": "$$.Map.Item.Value",
      "base_file_name.$": "$.job-spec.base_file_name"
    },

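It seems that Parameters can be used to build a custom input for each iteration. The $$ prefix reads from the context object of the state rather than from its input. It appears that ItemsPath selects the array and each element is then exposed through the context object as $$.Map.Item.Value, while ordinary $ paths in Parameters still resolve against the Map state's own input, which is why the base file name remains reachable.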
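
To make the data flow concrete, here is a minimal Python sketch of what each iteration should end up receiving. It is plain Python that only imitates the Map state, not Step Functions itself. The names build_iteration_inputs and lambda_handler are hypothetical, and the handler assumes the Task state forwards the iteration input unchanged as the Lambda payload.

```python
def build_iteration_inputs(state_input):
    """Roughly mimic the per-iteration inputs the Map state produces.

    Illustration only: this imitates ItemsPath + Parameters in plain Python.
    """
    job_spec = state_input["job-spec"]
    return [
        {
            # "part_name.$": "$$.Map.Item.Value" -> the current array element
            "part_name": part,
            # "base_file_name.$": "$.job-spec.base_file_name" -> same for every item
            "base_file_name": job_spec["base_file_name"],
        }
        for part in job_spec["part_array"]  # ItemsPath: $.job-spec.part_array
    ]


def lambda_handler(event, context):
    """Hypothetical handler; assumes the payload is the iteration input as-is."""
    s3_key = event["base_file_name"] + event["part_name"]
    return {"s3_key": s3_key}


if __name__ == "__main__":
    example_input = {
        "job-spec": {
            "base_file_name": "some_s3_key-",
            "part_array": ["part-0000.tsv", "part-0001.tsv"],
        }
    }
    for item in build_iteration_inputs(example_input):
        # prints some_s3_key-part-0000.tsv, then some_s3_key-part-0001.tsv
        print(lambda_handler(item, None)["s3_key"])
```

Note that with this change the iteration input no longer contains part_array, so the "Input.$": "$.part_array" line in the Task's Payload from the question would presumably need to change as well, for example to "Payload.$": "$" to pass the whole iteration input through.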

UPDATE: The AWS documentation linked in the comments below shows this being used.

sedavidw
  • Awesome, thank you! In my case, I wanted to pass in a parameter (S3 bucket name) generated in a separate resource. Using the `$$.Map.Item.Value` and `.$` was key for pairing the two when using Input, and then normal `Key: !Ref MyResource` in the external parameter. – Miles Mar 09 '20 at 15:02
  • It would be helpful if you could share a blog post or documentation on this; the AWS documentation is pretty thin on the details of this rather complicated means of passing data between states. In my case, I have one SF calling another, and need to pass an array of input to SF 2's Map state, so that each element of the array of input can be passed as a named parameter to a Glue job. – Chris Ivan Jan 20 '22 at 03:12
  • Access to $$.Map.Item.Value is demonstrated, but not very thoroughly explained, in the SF documentation https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-map-state.html – Rüdiger Schulz Apr 04 '22 at 09:01
  • https://docs.aws.amazon.com/step-functions/latest/dg/input-output-contextobject.html#contextobject-map – Kbalsamy Aug 26 '22 at 16:54