10

I am working with AWS Textract and want to analyze a multipage document, so I have to use the asynchronous operations. I first called the startDocumentAnalysis function and got a JobId in return. The job then needs to trigger a function that I have set up to run when the SNS topic receives a message.

These are my serverless file and handler file.

provider:
  name: aws
  runtime: nodejs8.10
  stage: dev
  region: us-east-1
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "s3:*"
      Resource: { "Fn::Join": ["", ["arn:aws:s3:::${self:custom.secrets.IMAGE_BUCKET_NAME}", "/*" ] ] }
    - Effect: "Allow"
      Action:
        - "sts:AssumeRole"
        - "SNS:Publish"
        - "lambda:InvokeFunction"
        - "textract:DetectDocumentText"
        - "textract:AnalyzeDocument"
        - "textract:StartDocumentAnalysis"
        - "textract:GetDocumentAnalysis"
      Resource: "*"

custom:
  secrets: ${file(secrets.${opt:stage, self:provider.stage}.yml)}

functions:
  routes:
    handler: src/functions/routes/handler.run
    events:
      - s3:
          bucket: ${self:custom.secrets.IMAGE_BUCKET_NAME}
          event: s3:ObjectCreated:*

  textract:
    handler: src/functions/routes/handler.detectTextAnalysis
    events:
      - sns: "TextractTopic"

resources:
  Resources:
    TextractTopic:
      Type: AWS::SNS::Topic
      Properties:
        DisplayName: "Start Textract API Response"
        TopicName: TextractResponseTopic

Handler.js

const AWS = require('aws-sdk');

const textract = new AWS.Textract();

module.exports.run = async (event) => {
  // Bucket and key of the object whose upload triggered this Lambda.
  const uploadedBucket = event.Records[0].s3.bucket.name;
  const uploadedObject = event.Records[0].s3.object.key;

  const params = {
    DocumentLocation: {
      S3Object: {
        Bucket: uploadedBucket,
        Name: uploadedObject
      }
    },
    FeatureTypes: [
      "TABLES",
      "FORMS"
    ],
    // Textract assumes this role to publish the job status to the topic.
    NotificationChannel: {
      RoleArn: 'arn:aws:iam::<account-id>:role/qvalia-ocr-solution-dev-us-east-1-lambdaRole',
      SNSTopicArn: 'arn:aws:sns:us-east-1:<account-id>:TextractTopic'
    }
  };

  // Resolves with an object containing the JobId once Textract accepts the job.
  const textractOutput = await textract.startDocumentAnalysis(params).promise();
  console.log('Started Textract job', textractOutput.JobId);
}

When I manually publish an SNS message to the topic, it does fire the textract Lambda, which currently contains this:

module.exports.detectTextAnalysis = async (event) => {
  console.log('SNS Topic isssss Generated');
  console.log(event.Records[0].Sns.Message);
};

What mistake am I making, and why is Textract's startDocumentAnalysis not publishing a message to the topic and triggering the Lambda?

Note: I haven't called startDocumentTextDetection before calling startDocumentAnalysis, though it is not necessary to call it first.

BPDESILVA
gokublack
  • Does `qvalia-ocr-solution-dev-us-east-1-lambdaRole` have enough permissions to publish over SNS? – rpadovani Jun 25 '19 at 10:31
  • I am also working with Amazon Textract, and the SNS publishing was working about a week ago and now it isn't. I have an application in which I didn't change anything in the publishing, and now it is broken. The devs must have broken it, since it is still in open preview. – griff4594 Jun 29 '19 at 16:29
  • @griff4594 I have the same problem and I'm going crazy trying to figure out what is wrong with this. Thanks for your comment – Ruben J Garcia Jul 02 '19 at 14:46
  • @griff4594 I noticed that if I use a permit-all policy in the role that pushes to SNS, it works. I don't know which permission I'm forgetting to make it work – Ruben J Garcia Jul 04 '19 at 13:03
  • @RubenJGarcia I got mine working because of the IAM role I'm using was not allowing Textract specifically in the Trusted Relationships. ```{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": [ "lambda.amazonaws.com", "textract.amazonaws.com" ] }, "Action": "sts:AssumeRole" } ] }``` – griff4594 Jul 11 '19 at 21:34

5 Answers

10

Make sure the Trusted Relationships (the trust policy) of the role you are using allow Textract to assume it:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "lambda.amazonaws.com",
          "textract.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
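
As a quick sanity check of the policy above, here is a minimal sketch of a helper that tests whether a trust policy document lets a given service assume the role. `trustsService` is a hypothetical name for illustration and is not part of the AWS SDK; it only inspects `Allow` statements with a `Service` principal and ignores conditions.

```javascript
// Hypothetical helper (not part of the AWS SDK): returns true when the trust
// policy contains an Allow statement letting `service` call sts:AssumeRole.
function trustsService(trustPolicy, service) {
  // IAM accepts a single value or an array in these fields, so normalize both.
  const toArray = (value) =>
    Array.isArray(value) ? value : value === undefined ? [] : [value];

  return toArray(trustPolicy.Statement).some((statement) => {
    if (statement.Effect !== 'Allow') return false;
    const actions = toArray(statement.Action);
    const services = toArray(statement.Principal && statement.Principal.Service);
    return actions.includes('sts:AssumeRole') && services.includes(service);
  });
}

// A role that only trusts Lambda, as a default Lambda execution role would.
const lambdaOnlyPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: { Service: 'lambda.amazonaws.com' },
      Action: 'sts:AssumeRole'
    }
  ]
};

console.log(trustsService(lambdaOnlyPolicy, 'textract.amazonaws.com')); // false
```

A policy shaped like the one in this answer would return `true` for `textract.amazonaws.com`.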
griff4594
  • This answer, in combination with ensuring that I have both the "sts:AssumeRole" and "sns:Publish" permissions, is what worked for me. Thanks! – Christian Gossain Oct 28 '20 at 16:30
2

The SNS Topic name must be AmazonTextract

In the end, your ARN should look like this:

arn:aws:sns:us-east-2:111111111111:AmazonTextract
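
A minimal sketch of checking a topic ARN against that naming rule, assuming the role relies on the AWS-managed AmazonTextractServiceRole policy, whose resource pattern is `arn:aws:sns:*:*:AmazonTextract*`. Both function names are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: extracts the topic name from an SNS topic ARN of the
// form arn:partition:sns:region:account-id:topic-name.
function topicNameFromArn(topicArn) {
  const parts = topicArn.split(':');
  if (parts.length !== 6 || parts[0] !== 'arn' || parts[2] !== 'sns') {
    throw new Error(`Not an SNS topic ARN: ${topicArn}`);
  }
  return parts[5];
}

// Hypothetical helper: true when the topic name starts with "AmazonTextract",
// which is what the managed policy's resource pattern requires.
function matchesManagedPolicyPattern(topicArn) {
  return topicNameFromArn(topicArn).startsWith('AmazonTextract');
}

console.log(matchesManagedPolicyPattern('arn:aws:sns:us-east-2:111111111111:AmazonTextract')); // true
console.log(matchesManagedPolicyPattern('arn:aws:sns:us-east-1:111111111111:TextractTopic')); // false
```

With a custom policy that grants sns:Publish on the topic, this naming restriction would not apply.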
Nikunj Kakadiya
  • Yes, if you use the aws-managed policy "AmazonTextractServiceRole", it is resource-restricted to "arn:aws:sns:*:*:AmazonTextract*", meaning that your SNS topic name must at least *start with* "AmazonTextract" – jameslol Jan 10 '22 at 23:10
0

If your bucket is encrypted, you also need to grant KMS permissions; otherwise it won't work.

Ruben J Garcia
0

I was able to get this working directly via the Serverless Framework by adding a Lambda execution role resource to my serverless.yml file:

resources:
  Resources:
    IamRoleLambdaExecution:
      Type: AWS::IAM::Role
      Properties:
        AssumeRolePolicyDocument:
          Version: "2012-10-17"
          Statement:
            - Effect: Allow
              Principal:
                Service:
                  - lambda.amazonaws.com
                  - textract.amazonaws.com
              Action: sts:AssumeRole

And then I just used the same role generated by Serverless (for the lambda function) as the notification channel role parameter when starting the Textract document analysis:
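
A minimal sketch of that step, assuming the role and topic ARNs are passed to the function through environment variables. The names `TEXTRACT_ROLE_ARN` and `TEXTRACT_TOPIC_ARN` and the helper `buildAnalysisParams` are hypothetical; in serverless.yml the role ARN could, for example, be exposed with `Fn::GetAtt` on `IamRoleLambdaExecution`.

```javascript
// Hypothetical helper: builds the startDocumentAnalysis parameters, reusing
// the Lambda execution role as the notification channel role.
function buildAnalysisParams(bucket, key, env = process.env) {
  const roleArn = env.TEXTRACT_ROLE_ARN;   // assumed environment variable
  const topicArn = env.TEXTRACT_TOPIC_ARN; // assumed environment variable
  if (!roleArn || !topicArn) {
    throw new Error('TEXTRACT_ROLE_ARN and TEXTRACT_TOPIC_ARN must be set');
  }
  return {
    DocumentLocation: { S3Object: { Bucket: bucket, Name: key } },
    FeatureTypes: ['TABLES', 'FORMS'],
    // Textract assumes RoleArn to publish the job status to SNSTopicArn.
    NotificationChannel: { RoleArn: roleArn, SNSTopicArn: topicArn }
  };
}
```

Inside the handler, the result would be passed on with something like `await textract.startDocumentAnalysis(buildAnalysisParams(bucket, key)).promise()`, where `textract` is an `AWS.Textract` client from the aws-sdk v2 package.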

Thanks to this post for pointing me in the right direction!

Christian Gossain
0

For anyone using the CDK in TypeScript, you will need to add Lambda as a ServicePrincipal as usual to the Lambda Execution Role. Next, access the assumeRolePolicy of the execution role and call the addStatements method.

The basic execution role, without any additional statements (add those later):

  this.executionRole = new iam.Role(this, 'ExecutionRole', {
    assumedBy: new ServicePrincipal('lambda.amazonaws.com'),
  });

Next, add Textract as an additional ServicePrincipal

  this.executionRole.assumeRolePolicy?.addStatements(
    new PolicyStatement({
      principals: [
        new ServicePrincipal('textract.amazonaws.com'),
      ],
      actions: ['sts:AssumeRole']
    })
  );

Also, ensure the execution role has full permissions on the target SNS topic (note that the topic has already been created and is accessed via the fromTopicArn method):

  const stmtSNSOps = new PolicyStatement({
    effect: iam.Effect.ALLOW,
    actions: [
      "SNS:*"
    ],
    resources: [
      this.textractJobStatusTopic.topicArn
    ]
  });

Add the policy statement to a global policy (within the active stack)

  this.standardPolicy = new iam.Policy(this, 'Policy', {
    statements: [
      ...
      stmtSNSOps, 
      ...
    ]
  });

Finally, attach the policy to the execution role

  this.executionRole.attachInlinePolicy(this.standardPolicy);
Matthew Pitts