
I have a Python script that takes a video and converts it to a series of small panoramas. There is an S3 bucket to which a video (mp4) will be uploaded. I need this file to be sent to the EC2 instance whenever it is uploaded. This is the flow:

  1. Upload video file to S3.
  2. This should trigger EC2 instance to start.
  3. Once it is running, I want the file to be copied to a particular directory inside the instance.
  4. After this, I want the Python file (panorama.py) to start running, read the video file from the directory, process it, and generate output images.
  5. These output images need to be uploaded to a new bucket or the same bucket which was initially used.
  6. Instance should terminate after this.

What I have done so far: I have created a Lambda function that is triggered whenever an object is added to that bucket. It stores the name of the file and the path. I had read that I now need to use an SQS queue, pass this name and path metadata to the queue, and use SQS to trigger the instance. Then I need to run a script in the instance that pulls the metadata from the SQS queue and uses it to copy the file (mp4) from the bucket to the instance. How do I do this? I am new to AWS and do not know much about SQS, how to transfer metadata, how to automatically trigger an instance, etc.

akshay acharya

2 Answers


Your wording is a bit confusing. You say that you want to "start" an instance (which suggests that the instance already exists), but then you say that you want to "terminate" the instance (which would permanently remove it). I am going to assume that you actually intend to "stop" the instance so that it can be used again.

You can put a shell script in the /var/lib/cloud/scripts/per-boot/ directory. This script will then be executed every time the instance starts.

When the instance has finished processing, it can call sudo shutdown now -h to turn off the instance. (Alternatively, it can tell EC2 to stop the instance, but using shutdown is easier.)
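As a rough illustration of this approach, the sketch below is a Python script that could be placed (executable, with its shebang) in /var/lib/cloud/scripts/per-boot/. The function names, bucket names, and paths are hypothetical, and how the instance learns which video to fetch is left open here.

```python
#!/usr/bin/env python3
"""Per-boot worker sketch: download a video, process it, upload results, shut down."""
import subprocess
from pathlib import PurePosixPath


def build_commands(bucket, key, work_dir, script_path, output_dir, output_bucket):
    """Return the commands to run, in order, as argv lists."""
    local_video = str(PurePosixPath(work_dir) / PurePosixPath(key).name)
    return [
        # Copy the uploaded video from S3 into the working directory.
        ["aws", "s3", "cp", f"s3://{bucket}/{key}", local_video],
        # Run the processing script (assumed to read its input from work_dir).
        ["python3", script_path],
        # Upload every generated image to the output bucket.
        ["aws", "s3", "cp", output_dir, f"s3://{output_bucket}/", "--recursive"],
        # Power off the machine once everything is done.
        ["sudo", "shutdown", "now", "-h"],
    ]


def run_pipeline(commands):
    """Run each command in turn, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run_pipeline(build_commands(...))` at the bottom of the script would execute the steps at every boot. Because `check=True` aborts on the first error, a failed run never reaches the shutdown command and leaves the instance running, which may or may not be what you want.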

For details, see: Auto-Stop EC2 instances when they finish a task - DEV Community

John Rotenstein
  • This seems interesting and practical. Although if he is using queue isn't it better to stop the instance only when the queue is empty? Another question, what do you think about this solution for file processing instead of using mediaconvert? – ashraf minhaj Oct 23 '22 at 10:28

I have tried to answer in the most minimal way; many of the points below can be improved further. Even so, there is quite a lot here, given that you mentioned you are new to AWS.

Using AWS Lambda with Amazon S3

Amazon S3 can send an event to a Lambda function when an object is created or deleted. You configure notification settings on a bucket, and grant Amazon S3 permission to invoke a function on the function's resource-based permissions policy.
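A minimal sketch of the Lambda side, assuming the standard S3 notification event layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`). The helper name `extract_s3_objects` is made up for illustration; the handler only logs what was uploaded.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs for every record in an S3 notification event."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (a space becomes '+'), so decode them.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Lambda entry point: log each uploaded object and return the list."""
    uploaded = extract_s3_objects(event)
    for bucket, key in uploaded:
        print(f"New object: s3://{bucket}/{key}")
    return uploaded
```

The bucket name and key returned here are the two values the instance later needs in order to download the video.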

When the object is uploaded, it triggers the Lambda function, which creates the instance and passes it EC2 user data (see: Run commands on your Linux instance at launch).

Make sure the EC2 instance has the permissions it needs to download and upload the objects, granted through an instance profile (see: Using instance profiles).
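For illustration, the policy attached to the role behind the instance profile might look roughly like the document built below. The bucket names are placeholders and `build_s3_policy` is a hypothetical helper; depending on the commands you run, further actions such as `s3:ListBucket` may be needed.

```python
import json


def build_s3_policy(input_bucket, output_bucket):
    """Build an IAM policy document: read from one bucket, write to another."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allows downloading the uploaded video.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{input_bucket}/*",
            },
            {
                # Allows uploading the generated images.
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{output_bucket}/*",
            },
        ],
    }


# Placeholder bucket names; substitute your own.
policy_json = json.dumps(build_s3_policy("my-input-bucket", "my-output-bucket"), indent=2)
```

If the same bucket is used for input and output, pass its name for both arguments.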

The user data contains a script that does the rest of the work your workflow needs:

  1. Download the S3 object. You can pass the object name and the S3 bucket name into the same script.
  2. Once step 1 has finished, start panorama.py, which processes the video.
  3. Next, upload the output objects to the S3 bucket.
  4. Finally, terminate the instance. This is a bit tricky; one way is to change the instance-initiated shutdown behavior (see: Change the instance initiated shutdown behavior) so that shutting down from inside the instance terminates it.

Or you can use the method below to terminate the instance, in which case your EC2 instance profile must have permission to terminate the instance.

ec2-terminate-instances $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
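A rough Python equivalent of the command above, as a sketch: it reads the instance ID through the token-based instance metadata flow (IMDSv2) and then calls boto3's `terminate_instances`. The function names and the `region_name` parameter are illustrative, and the instance profile would still need the `ec2:TerminateInstances` permission.

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"


def token_request(ttl_seconds=60):
    """Build the IMDSv2 request that obtains a short-lived session token."""
    return urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def instance_id_request(token):
    """Build the request that reads this instance's ID using the token."""
    return urllib.request.Request(
        f"{IMDS}/meta-data/instance-id",
        headers={"X-aws-ec2-metadata-token": token},
    )


def terminate_self(region_name):
    """Look up this instance's ID and ask EC2 to terminate it."""
    import boto3  # imported here so the helpers above need only the standard library

    with urllib.request.urlopen(token_request(), timeout=2) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(instance_id_request(token), timeout=2) as resp:
        instance_id = resp.read().decode()
    boto3.client("ec2", region_name=region_name).terminate_instances(
        InstanceIds=[instance_id]
    )
    return instance_id
```

`terminate_self` only works when run on the instance itself, since the metadata address is reachable only from inside it.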

You can wrap the above steps into a shell script inside the userdata.
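One way to wrap those steps, sketched in Python so the Lambda function can fill in the bucket and key before launching the instance. The helper name, the default paths, and the output bucket are assumptions; the sketch also assumes panorama.py and the AWS CLI are already present on the AMI.

```python
import shlex


def build_user_data(bucket, key, output_bucket,
                    work_dir="/home/ec2-user/input",
                    output_dir="/home/ec2-user/output",
                    script="/home/ec2-user/panorama.py"):
    """Return a bash user data script that handles one uploaded video."""
    src = shlex.quote(f"s3://{bucket}/{key}")
    dest = shlex.quote(f"s3://{output_bucket}/")
    lines = [
        "#!/bin/bash",
        # Make sure the input and output directories exist.
        f"mkdir -p {shlex.quote(work_dir)} {shlex.quote(output_dir)}",
        # Step 1: download the uploaded video.
        f"aws s3 cp {src} {shlex.quote(work_dir + '/')}",
        # Step 2: process it.
        f"python3 {shlex.quote(script)}",
        # Step 3: upload the generated images.
        f"aws s3 cp {shlex.quote(output_dir)} {dest} --recursive",
        # Step 4: power off (user data runs as root, so sudo is not needed).
        "shutdown -h now",
    ]
    return "\n".join(lines) + "\n"
```

The returned string can be passed as `UserData` to `run_instances`. If the instance is launched with `InstanceInitiatedShutdownBehavior='terminate'`, the final `shutdown` terminates it instead of stopping it.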

Lambda code that launches the EC2 instance:

def launch_instance(EC2, config, user_data, tag_specs=None):
    """Launch a single EC2 instance that runs user_data on first boot."""
    params = {
        'ImageId': config['ami'],  # e.g. ami-0123b531fc646552f
        'InstanceType': config['instance_type'],
        'KeyName': config['ssh_key_name'],
        'MinCount': 1,
        'MaxCount': 1,
        'SecurityGroupIds': config['security_group_ids'],
        # boto3 base64-encodes UserData for run_instances, so pass plain text.
        'UserData': user_data,
    }
    # Optional list of TagSpecification dicts; omitted when not supplied.
    if tag_specs:
        params['TagSpecifications'] = tag_specs

    ec2_response = EC2.run_instances(**params)

    new_instance_resp = ec2_response['Instances'][0]
    instance_id = new_instance_resp['InstanceId']
    print(f"[DEBUG] Full ec2 instance response data for '{instance_id}': {new_instance_resp}")

    return (instance_id, new_instance_resp)

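For a detailed explanation, see: Upload file to S3 -> Launch EC2 instance (https://dev.to/nonbeing/upload-file-to-s3-launch-ec2-instance-7m9)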

samtoddler
  • I am not able to edit my user data in the console. Even though my instance is stopped, there is no such option as 'Edit user data' in my instance settings. what do i do? – akshay acharya Feb 26 '21 at 11:36
  • @akshayacharya [Select the instance and choose Instance state, Stop instance. If this option is disabled, either the instance is already stopped or its root device is an instance store volume.](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html#user-data-console) – samtoddler Feb 26 '21 at 11:38
  • How do I pass the file name and path to the user data file. How will the instance know what to download? And, what is the code to download file from s3 to a local directory within the instance? Also, I have an existing ec2 instance, so can i just call that? I dont have to create a new instance right? Cuz the program and all the directories are already in this instance – akshay acharya Feb 26 '21 at 12:14
  • Basically, I have an already existing instance. I want to start that and run the panorama.py inside it after the file is copied in a local directory inside the instance. Once it processes the video and output is generated, i want to stop the instance. This should be repeated for every new video uploaded into the bucket – akshay acharya Feb 26 '21 at 12:33
  • @akshayacharya in your Lambda, when you start the instance you can pass the file name and S3 bucket name as variables. For downloading the object, you use the [aws s3 cp](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/cp.html) command, and the same for uploading. If you already have an instance, you just have to [start the instance](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.Client.start_instances) instead of creating a new instance. – samtoddler Feb 26 '21 at 15:29
  • How do I start the py script inside the instance after the instance has started? – akshay acharya Mar 01 '21 at 03:43
  • Is there any example you can link me to? – akshay acharya Mar 01 '21 at 04:00
  • @akshayacharya I added the link in the answer where you can look for the detailed explanation: [Upload file to S3 -> Launch EC2 instance](https://dev.to/nonbeing/upload-file-to-s3-launch-ec2-instance-7m9). You can replace the call to `create instance` with a call to start the instance, provided you have successfully updated the `userdata`. [How can I utilize user data to automatically run a script with every restart of my Amazon EC2 Linux instance?](https://aws.amazon.com/premiumsupport/knowledge-center/execute-user-data-ec2/) – samtoddler Mar 01 '21 at 08:06
  • @akshayacharya to start the Python script, call it from the shell script in the `userdata`. You can [execute python script inside the bash script](https://stackoverflow.com/a/4377147/2246345) – samtoddler Mar 01 '21 at 08:08
  • How do I pass the S3 bucket name and file path to the user data file or the instance? I will get it in the Lambda function, right? How do I pass those variables before the instance is started and the user data starts execution? Can I copy the file to the instance before it starts? – akshay acharya Mar 02 '21 at 04:19
  • @akshayacharya you can pass the vars via the `userdata`, and you can also modify the `userdata` with [modify_instance_attribute](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.Client.modify_instance_attribute). That way, you can modify it before starting your instance. – samtoddler Mar 03 '21 at 21:10
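
The existing-instance variant discussed in these comments (update the user data of the stopped instance, then start it) might be sketched as follows. `start_with_user_data` is a hypothetical helper that takes a boto3 EC2 client. By default, user data scripts run only on the first boot, so the instance would also need the cloud-init configuration described in the AWS article linked above for the script to run on every start.

```python
def start_with_user_data(ec2, instance_id, user_data):
    """Replace a stopped instance's user data, then start the instance."""
    # User data can only be modified while the instance is stopped.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        # Assumption: boto3 takes raw bytes here and handles the base64
        # encoding itself; check this against the boto3 documentation.
        UserData={"Value": user_data.encode("utf-8")},
    )
    response = ec2.start_instances(InstanceIds=[instance_id])
    # Report the state EC2 says the instance is moving into.
    return response["StartingInstances"][0]["CurrentState"]["Name"]
```

In the Lambda function this would be called with `boto3.client("ec2")`, the ID of the existing instance, and a script that contains the bucket name and key taken from the S3 event.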