Sorry for repeating a question that has already been asked here, but the existing answers didn't solve my problem: how do I convert a PDF file from S3 into a string variable using a Lambda function?
My Lambda function shows this error:
Unable to import module 'lambda_function': No module named 'pdfminer'
I found the code below in this answer, but I am stuck implementing it in Lambda. Please share your ideas. I think that if the code below is correct, the data variable will contain the text of the PDF file from S3 as a string. If not, please suggest how I should change my code.
import json
import boto3
import botocore
import sys
import urllib.parse
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage
from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
from pdfminer.layout import LAParams
import io

s3 = boto3.client('s3')


def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    # Object keys in S3 event notifications are URL-encoded, so decode them first.
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])
    filename = 'myfile'
    # Download the PDF to /tmp, the writable scratch directory in Lambda.
    s3.download_file(bucket, key, '/tmp/' + filename)
    print('reading')
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    # Create a PDF interpreter object.
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    # PDFPage.get_pages expects a binary file object, not the file's contents.
    with open('/tmp/' + filename, 'rb') as fp:
        # Process each page contained in the document.
        for page in PDFPage.get_pages(fp):
            interpreter.process_page(page)
    device.close()
    data = retstr.getvalue()
    print(data)
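
To rule out the event handling as a source of trouble, the part of the handler that reads the bucket and key can be checked locally without AWS or pdfminer. This is only a minimal sketch: the helper name `bucket_and_key` and the sample event are made up for illustration, and the event is trimmed to the two fields the handler reads. It relies on the fact that object keys in S3 event notifications are URL-encoded.

```python
from urllib.parse import unquote_plus


def bucket_and_key(event):
    """Return (bucket, key) from the first record of an S3 event notification."""
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    # Keys arrive URL-encoded in the event (for example, a space becomes '+').
    key = unquote_plus(record['object']['key'])
    return bucket, key


# Hypothetical event, trimmed to the fields the handler actually reads.
sample_event = {
    'Records': [
        {
            's3': {
                'bucket': {'name': 'my-bucket'},
                'object': {'key': 'reports/my+file.pdf'},
            }
        }
    ]
}

print(bucket_and_key(sample_event))
```

If the decoded key printed here matches the object name in the bucket, the download step should at least be asking for the right file.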