I have an AWS Lambda function configured to call start_file_transfer on an AWS Transfer Family SFTP connector every X minutes. The problem I'm having is that the Lambda sometimes runs again before the previous batch of files has finished downloading, which understandably causes file-locking errors.
The obvious solution to me is: when the Lambda runs, check which downloads are still in progress and skip those files.
The start_file_transfer call:

    import boto3

    transfer = boto3.client("transfer")

    transfer.start_file_transfer(
        ConnectorId="[my-connector-id]",
        RetrieveFilePaths=[sftp_file],
        LocalDirectoryPath="/[my-s3-bucket]/[my-s3-key]"
    )
returns the following response:
    {
        "TransferId": "3296exe9-21fy-063r-a97n-2c91e6u3ab81",
        "ResponseMetadata": {
            "RequestId": "l5au1f5e-v59u-763w-h7gr-9461y268135o",
            "HTTPStatusCode": 200,
            "HTTPHeaders": {
                "date": "Wed, 30 Aug 2023 15:30:30 GMT",
                "content-type": "application/x-amz-json-1.1",
                "content-length": "6429",
                "connection": "keep-alive",
                "x-amzn-requestid": "l5au1f5e-v59u-763w-h7gr-9461y268135o"
            },
            "RetryAttempts": 0
        }
    }
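The only handle I get back is that TransferId, so for now I capture it at start time along with the file path. A minimal sketch of what my Lambda records (the record shape is my own convention, not anything AWS defines):

```python
from datetime import datetime, timezone


def build_transfer_record(response, file_path):
    """Build a tracking record from the start_file_transfer response.

    The record shape is my own convention; only TransferId comes from AWS.
    """
    return {
        "TransferId": response["TransferId"],
        "FilePath": file_path,
        "Status": "IN_PROGRESS",
        "StartedAt": datetime.now(timezone.utc).isoformat(),
    }


# Example using the response shown above (trimmed to the relevant field):
record = build_transfer_record(
    {"TransferId": "3296exe9-21fy-063r-a97n-2c91e6u3ab81"},
    "my_path/file.txt",
)
```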
When the transfer operation completes, a CloudWatch log event like this is created:
    {
        "operation": "RETRIEVE",
        "timestamp": "2023-08-30T16:30:33.227572Z",
        "connector-id": "[my-connector-id]",
        "transfer-id": "3296exe9-21fy-063r-a97n-2c91e6u3ab81",
        "file-transfer-id": "3296exe9-21fy-063r-a97n-2c91e6u3ab81/F6drt7oppd2+87YuTREWWW",
        "url": "sftp://sftp.example.com",
        "file-path": "my_path/file.txt",
        "status-code": "COMPLETED",
        "start-time": "2023-08-30T16:30:32.102939Z",
        "end-time": "2023-08-30T16:30:32.886031Z",
        "account-id": "999999999999",
        "connector-arn": "arn:aws:transfer:[my-region]:999999999999:connector/[my-connector-id]",
        "local-directory-path": "/[my-s3-bucket]/[my-s3-key]"
    }
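The closest thing to a completion signal I've found is parsing these log events myself. A sketch of what I've been experimenting with, polling the connector's log group and checking the status-code field (I'm assuming the log group is named /aws/transfer/&lt;connector-id&gt; — adjust to wherever your connector's logging role actually writes):

```python
import json


def parse_transfer_status(message):
    """Extract (transfer-id, status-code) from a connector log message.

    Returns None for messages that aren't transfer status events.
    """
    try:
        event = json.loads(message)
    except json.JSONDecodeError:
        return None
    if "transfer-id" not in event or "status-code" not in event:
        return None
    return event["transfer-id"], event["status-code"]


def completed_transfer_ids(log_events):
    """Collect transfer-ids that have logged a COMPLETED event."""
    done = set()
    for e in log_events:
        parsed = parse_transfer_status(e["message"])
        if parsed and parsed[1] == "COMPLETED":
            done.add(parsed[0])
    return done


def fetch_recent_events(connector_id, start_time_ms):
    """Pull recent events from the connector's log group.

    Assumption: logs land in /aws/transfer/<connector-id>; check your
    connector's logging-role configuration for the real group name.
    """
    import boto3  # imported here so the pure helpers above work offline

    logs = boto3.client("logs")
    resp = logs.filter_log_events(
        logGroupName=f"/aws/transfer/{connector_id}",
        startTime=start_time_ms,
    )
    return resp["events"]
```

Any TransferId I started but that has no COMPLETED (or FAILED) event yet would be treated as still in flight. It works, but it feels fragile compared to a real query API.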
However, I can't find a way to natively monitor in-progress downloads between these two events. Nowhere in the AWS API do I see a way to query any resource using the returned TransferId. Any suggestions for natively querying in-flight downloads on an SFTP connector? Or is my best bet to build and maintain my own DynamoDB table to track transfer state?
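For reference, the DynamoDB approach I'm considering looks roughly like this: write an IN_PROGRESS item when start_file_transfer returns, flip it to COMPLETED from a Lambda subscribed to the connector's log group, and skip any file that still has an open item. The table name and attributes below are my own invention, not anything AWS defines:

```python
def files_to_transfer(candidate_paths, in_progress_paths):
    """Pure skip logic: drop any file that already has an open transfer."""
    blocked = set(in_progress_paths)
    return [p for p in candidate_paths if p not in blocked]


def get_in_progress_paths(table_name="sftp-transfer-state"):
    """Scan my (hypothetical) state table for open transfers.

    Assumption: a DynamoDB table keyed on TransferId with Status and
    FilePath attributes that my own code maintains.
    """
    import boto3  # imported here so the pure helper above works offline

    table = boto3.resource("dynamodb").Table(table_name)
    resp = table.scan(
        FilterExpression="#s = :ip",
        ExpressionAttributeNames={"#s": "Status"},
        ExpressionAttributeValues={":ip": "IN_PROGRESS"},
    )
    return [item["FilePath"] for item in resp["Items"]]
```

It would work, but it's extra infrastructure and a second source of truth to keep consistent, which is why I'd prefer a native query if one exists.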