I am organising AWS resources for tagging and have captured the tag data in a CSV file (a sample is shown further down). For each resource_id, I need to make sure that a fixed set of tag_key values is present. The required set is:
tag_key
Application
Client
Environment
Name
Owner
Project
Purpose
I'm new to pandas and have only managed to read the CSV file into a DataFrame so far:
import pandas as pd
file_name = "z.csv"
df = pd.read_csv(file_name, names=['resource_id', 'resource_type', 'tag_key', 'tag_value'])
print(df)
Sample CSV file:
vol-00441b671ca48ba41,volume,Environment,Development
vol-00441b671ca48ba41,volume,Name,Database Files
vol-00441b671ca48ba41,volume,Project,Application Development
vol-00441b671ca48ba41,volume,Purpose,Web Server
i-1234567890abcdef0,instance,Environment,Production
i-1234567890abcdef0,instance,Owner,Fast Company
I am expecting the output to be as follows, with an empty tag_value for every required tag_key that is missing:
vol-00441b671ca48ba41,volume,Environment,Development
vol-00441b671ca48ba41,volume,Name,Database Files
vol-00441b671ca48ba41,volume,Project,Application Development
vol-00441b671ca48ba41,volume,Purpose,Web Server
vol-00441b671ca48ba41,volume,Client,
vol-00441b671ca48ba41,volume,Owner,
vol-00441b671ca48ba41,volume,Application,
i-1234567890abcdef0,instance,Environment,Production
i-1234567890abcdef0,instance,Owner,Fast Company
i-1234567890abcdef0,instance,Application,
i-1234567890abcdef0,instance,Client,
i-1234567890abcdef0,instance,Name,
i-1234567890abcdef0,instance,Project,
i-1234567890abcdef0,instance,Purpose,
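
For reference, here is a minimal sketch of one way the expected output above might be produced. It is not a confirmed solution. It assumes pandas 1.2 or later (for `how="cross"`), that each resource_id has a single resource_type, and it reads the sample inline via `io.StringIO` instead of `z.csv`. The names `REQUIRED_TAGS`, `resources`, `required`, `full`, and `result` are made up for illustration. The idea is to build every (resource, required tag) pair with a cross join, then left-merge the existing rows onto it so that missing tags end up with an empty tag_value.

```python
import io

import pandas as pd

# Required tag keys, taken from the list in the question.
REQUIRED_TAGS = ["Application", "Client", "Environment", "Name",
                 "Owner", "Project", "Purpose"]

# Sample data from the question, inlined so the sketch is self-contained.
csv_text = """\
vol-00441b671ca48ba41,volume,Environment,Development
vol-00441b671ca48ba41,volume,Name,Database Files
vol-00441b671ca48ba41,volume,Project,Application Development
vol-00441b671ca48ba41,volume,Purpose,Web Server
i-1234567890abcdef0,instance,Environment,Production
i-1234567890abcdef0,instance,Owner,Fast Company
"""

cols = ["resource_id", "resource_type", "tag_key", "tag_value"]
df = pd.read_csv(io.StringIO(csv_text), names=cols)

# One row per distinct resource.
resources = df[["resource_id", "resource_type"]].drop_duplicates()

# Every resource paired with every required tag key (cross join).
required = pd.DataFrame({"tag_key": REQUIRED_TAGS})
full = resources.merge(required, how="cross")

# Attach existing tag values; pairs with no match get NaN, replaced by "".
result = full.merge(df, on=["resource_id", "resource_type", "tag_key"],
                    how="left")
result["tag_value"] = result["tag_value"].fillna("")

# Write in the same header-less CSV layout as the input.
print(result.to_csv(index=False, header=False))
```

Two caveats with this sketch: the rows come out in the order of `REQUIRED_TAGS` for each resource, not with the existing tags first as in the expected output, and any tag_key in the CSV that is not in `REQUIRED_TAGS` would be dropped.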