When using Textract from the paws package in R, the start_document_analysis call requires an S3Object in DocumentLocation:

textract$start_document_analysis(
  DocumentLocation = list(
    S3Object = list(Bucket = bucket, Name = file)
  )
)

Is it possible to use DocumentLocation without an S3Object? I would prefer to just provide the path to a local PDF.

Ramón J Romero y Vigil
jkortner

1 Answer

The start_document_analysis API only accepts an S3 object as input. Unlike the analyze_document API, it does not accept the document as a base64-encoded string (see also the CLI docs at https://docs.aws.amazon.com/cli/latest/reference/textract/start-document-analysis.html).
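If the synchronous API is enough for your case, a minimal sketch of sending a local file to analyze_document could look like the following. It assumes that paws accepts a raw vector for Bytes and handles the encoding itself, and that your file is within the limits of the synchronous API (as far as I know, single-page documents only). The helper names read_document_bytes and analyze_local_document are made up for illustration.

```r
# Read a local file into a raw vector (hypothetical helper name).
read_document_bytes <- function(path) {
  readBin(path, what = "raw", n = file.size(path))
}

# Send the bytes of a local file to the synchronous analyze_document API.
# Assumes AWS credentials are already configured for paws.
analyze_local_document <- function(path, features = c("TABLES", "FORMS")) {
  textract <- paws::textract()
  textract$analyze_document(
    Document = list(Bytes = read_document_bytes(path)),
    FeatureTypes = as.list(features)
  )
}

# Example call (not run here, needs credentials and a real file):
# result <- analyze_local_document("invoice.pdf")
```

For multi-page PDFs this will not help, and you are back to start_document_analysis with S3.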

So unfortunately you have to use S3 as a place to (temporarily) store your file. Of course, you can write your own logic to upload it first. A good tutorial on that can be found at https://www.gormanalysis.com/blog/connecting-to-aws-s3-with-r/. Since you have already set up credentials, you can skip many of the early steps and start at step 3, for example.
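As a rough sketch of that "own logic" using paws alone: upload the local file, start the job, poll until it finishes, and delete the temporary object afterwards. The function names make_document_location and analyze_via_s3, the 5-second polling interval, and the choice of FeatureTypes are my own assumptions. The bucket must already exist and your credentials need S3 and Textract permissions.

```r
# Build the DocumentLocation argument for an object in S3 (hypothetical helper).
make_document_location <- function(bucket, key) {
  list(S3Object = list(Bucket = bucket, Name = key))
}

# Upload a local file to S3, run an asynchronous analysis on it, and
# remove the temporary object again when the function exits.
analyze_via_s3 <- function(path, bucket, key = basename(path),
                           features = c("TABLES", "FORMS")) {
  s3 <- paws::s3()
  textract <- paws::textract()

  # Upload the file contents as a raw vector.
  body <- readBin(path, what = "raw", n = file.size(path))
  s3$put_object(Bucket = bucket, Key = key, Body = body)
  on.exit(s3$delete_object(Bucket = bucket, Key = key), add = TRUE)

  job <- textract$start_document_analysis(
    DocumentLocation = make_document_location(bucket, key),
    FeatureTypes = as.list(features)
  )

  # Poll until the job is no longer in progress.
  repeat {
    result <- textract$get_document_analysis(JobId = job$JobId)
    if (result$JobStatus != "IN_PROGRESS") break
    Sys.sleep(5)
  }
  result
}

# Example call (not run here, needs credentials and an existing bucket):
# result <- analyze_via_s3("report.pdf", bucket = "my-temp-bucket")
```

Note that this only returns the first page of results. For larger documents you would keep calling get_document_analysis with the returned NextToken until it is empty.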

LRutten