I'm kind of new to coding in Python and I need your help.
My original dataframe is:
import numpy as np
import pandas as pd
df=pd.DataFrame({'ProductArn': [ 'arn:aws:securityhub:eu-central-1::product/aws/securityhub', 'arn:aws:securityhub:eu-central-1::product/aws/securityhub', 'arn:aws:securityhub:eu-central-1::product/aws/securityhub', 'arn:aws:securityhub:eu-central-1::product/aws/securityhub', 'arn:aws:securityhub:eu-central-1::product/aws/securityhub', 'arn:aws:securityhub:eu-central-1::product/aws/securityhub', 'arn:aws:securityhub:eu-central-1::product/aws/securityhub', 'arn:aws:securityhub:eu-central-1::product/aws/securityhub'],
'GeneratorId': [ 'aws-foundational-security-best-practices/v/1.0.0/SecretsManager.4', 'aws-foundational-security-best-practices/v/1.0.0/EC2.6', 'aws-foundational-security-best-practices/v/1.0.0/S3.4', 'aws-foundational-security-best-practices/v/1.0.0/S3.5', 'aws-foundational-guardduty-practices/v/1.0.0/SecretsManager.4', 'aws-foundational-splitfunction-practices/v/1.0.0/SecretsManager.4', 'aws-foundational-security-best-practices/v/1.0.0/S3.5', 'aws-foundational-security-best-practices/v/1.0.0/S3.5'],
'AwsAccountId': [ 961225000000.0, 961225000000.0, 961225000000.0, 961225000000.0, 961225000000.0, 961225000000.0, 971225000000.0, 971225000000.0],
'Types': ['Software and Configuration Checks/Industry and Regulatory Standards/AWS-Foundational-Security-Best-Practices', 'Software and Configuration Checks/Industry and Regulatory Standards/AWS-Foundational-Security-Best-Practices', 'Software and Configuration Checks/Industry and Regulatory Standards/AWS-Foundational-Security-Best-Practices', 'Software and Configuration Checks/Industry and Regulatory Standards/AWS-Foundational-Security-Best-Practices','Software and Configuration Checks/Industry and Regulatory Standards/AWS-Foundational-Security-Best-Practices', 'Software and Configuration Checks/Industry and Regulatory Standards/AWS-Foundational-Security-Best-Practices?', 'Software and Configuration Checks/Industry and Regulatory Standards/AWS-Foundational-Security-Best-Practices?', 'Software and Configuration Checks/Industry and Regulatory Standards/AWS-Foundational-Security-Best-Practices?'],
'Severity': [ '{Product: 40, Normalized: 40}', '{Product: 40, Normalized: 40}', '{Product: 40, Normalized: 40}', '{Product: 40, Normalized: 40}', '{Product: 40, Normalized: 40}','{Product: 40, Normalized: 40}', '{Product: 40, Normalized: 40}', '{Product: 40, Normalized: 40}'],
'Title': ['SecretsManager.4 Secrets Manager secrets should be rotated within a specified number of days', 'SecretsManager.4 Secrets Manager secrets should be rotated within a specified number of days', 'SecretsManager.4 Secrets Manager secrets should be rotated within a specified number of days', 'SecretsManager.4 Secrets Manager secrets should be rotated within a specified number of days', 'SecretsManager.4 Secrets Manager secrets should be rotated within a specified number of days', 'SecretsManager.4 Secrets Manager secrets should be rotated within a specified number of days', 'SecretsManager.4 Secrets Manager secrets should be rotated within a specified number of days', 'SecretsManager.4 Secrets Manager secrets should be rotated within a specified number of days'],
'ProductFields':['{StandardsArn: arn:aws:securityhub:::standards/aws-foundational-security-best-practices/v/1.0.0}', '{StandardsArn: arn:aws:securityhub:::standards/aws-foundational-security-best-practices/v/1.0.0}', '{StandardsArn: arn:aws:securityhub:::standards/aws-foundational-security-best-practices/v/1.0.0}', '{StandardsArn: arn:aws:securityhub:::standards/aws-foundational-security-best-practices/v/1.0.0}', '{StandardsArn: arn:aws:securityhub:::standards/aws-foundational-security-best-practices/v/1.0.0}', '{StandardsArn: arn:aws:securityhub:::standards/aws-foundational-security-best-practices/v/1.0.0}', '{StandardsArn: arn:aws:securityhub:::standards/aws-foundational-security-best-practices/v/1.0.0}', '{StandardsArn: arn:aws:securityhub:::standards/aws-foundational-security-best-practices/v/1.0.0}'],
'Compliance': ['{Status: FAILED}', '{Status: FAILED}', '{Status: FAILED}', '{Status: FAILED}', '{Status: FAILED}', '{Status: FAILED}', '{Status: FAILED}', '{Status: FAILED}'],
'WorkflowState': [ 'NEW', 'NEW', 'NEW', 'NEW', 'NEW', 'NEW', 'NEW', 'NEW' ]})
As a final output, I want to keep the rows whose GeneratorId contains "best-practice" plus the rows whose GeneratorId contains "guardduty", and combine both sets per AwsAccountId.
My data frame has two AwsAccountId values. For 961225000000.0, GeneratorId has four unique values that contain "best-practice" and one that contains "guardduty". For 971225000000.0, GeneratorId has one unique value that contains "best-practice". The final CSV should therefore contain only 6 rows, but my code is outputting the original dataset.
What I coded so far was:
pd.Series(["ProductArn", "GeneratorId", "Types", "Severity","Title", "ProductFields","Compliance","WorkflowState" ], dtype="string")
pd.Series(["ProductArn", "GeneratorId", "Types", "Severity","Title", "ProductFields","Compliance","WorkflowState"], dtype=pd.StringDtype())
df['AwsAccountId'] = df['AwsAccountId'].apply(np.int64)
df.groupby(['AwsAccountId']).filter(lambda gr: gr.GeneratorId.str.contains("best-practice","guardduty").any())
However, this groupby is not outputting what I need.
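A likely reason the call above returns everything, shown on a small hypothetical frame (`demo` and its values are made up for illustration, not taken from my data): the second positional argument of `str.contains` is `case`, not a second pattern, and `groupby(...).filter(...)` keeps or drops whole groups rather than individual rows. This is only a sketch of the behaviour as I understand it.

```python
import pandas as pd

# Hypothetical miniature of the real frame: same two column names, made-up values.
demo = pd.DataFrame({
    "AwsAccountId": [1, 1, 2, 3],
    "GeneratorId": [
        "x-best-practices/A",
        "x-splitfunction-practices/C",
        "x-guardduty-practices/B",
        "x-splitfunction-practices/D",
    ],
})

# "guardduty" lands in the `case` parameter, so only "best-practice" is searched.
one_pattern = demo["GeneratorId"].str.contains("best-practice", "guardduty")

# A regex alternation matches either substring.
either = demo["GeneratorId"].str.contains("best-practice|guardduty")

# filter() decides per group: if any row in an account matches,
# every row of that account is kept, including non-matching ones.
kept_groups = demo.groupby("AwsAccountId").filter(
    lambda gr: gr["GeneratorId"].str.contains("best-practice|guardduty").any()
)

print(one_pattern.tolist())
print(either.tolist())
print(kept_groups)
```

In this miniature, account 1 keeps its "splitfunction" row because another row of the same account matches, which mirrors what happens with the full data.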
The output I want is shown below, plus the remaining columns and their values (ProductArn, Types, Severity, Title, ProductFields, Compliance, WorkflowState):
In [3]: iwantthis
Out[3]:
   AwsAccountId  GeneratorId
0  961225000000  aws-foundational-security-best-practices/v/1.0.0/SecretsManager.4
1  961225000000  aws-foundational-security-best-practices/v/1.0.0/EC2.6
2  961225000000  aws-foundational-security-best-practices/v/1.0.0/S3.4
3  961225000000  aws-foundational-security-best-practices/v/1.0.0/S3.5
4  961225000000  aws-foundational-guardduty-practices/v/1.0.0/SecretsManager.4
5  971225000000  aws-foundational-security-best-practices/v/1.0.0/S3.5
Can someone help?
Thank you.
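One possible approach, sketched under assumptions rather than offered as the definitive answer: filter rows with a boolean mask instead of filtering groups, then drop repeated (AwsAccountId, GeneratorId) pairs. The frame below is a trimmed stand-in with only the two relevant columns, and it assumes the first GeneratorId is hyphenated ("best-practices") as in the desired output. With the full frame, the other columns would simply come along with the surviving rows.

```python
import pandas as pd

# Trimmed stand-in for the question's frame: only the columns that drive the filter.
df = pd.DataFrame({
    "AwsAccountId": [961225000000.0] * 6 + [971225000000.0] * 2,
    "GeneratorId": [
        "aws-foundational-security-best-practices/v/1.0.0/SecretsManager.4",
        "aws-foundational-security-best-practices/v/1.0.0/EC2.6",
        "aws-foundational-security-best-practices/v/1.0.0/S3.4",
        "aws-foundational-security-best-practices/v/1.0.0/S3.5",
        "aws-foundational-guardduty-practices/v/1.0.0/SecretsManager.4",
        "aws-foundational-splitfunction-practices/v/1.0.0/SecretsManager.4",
        "aws-foundational-security-best-practices/v/1.0.0/S3.5",
        "aws-foundational-security-best-practices/v/1.0.0/S3.5",
    ],
})

# Show account ids as integers, as in the desired output.
df["AwsAccountId"] = df["AwsAccountId"].astype("int64")

# Row-level mask: True where GeneratorId contains either substring.
mask = df["GeneratorId"].str.contains("best-practice|guardduty")

# Keep matching rows, then keep one row per (account, generator) pair.
result = (
    df[mask]
    .drop_duplicates(subset=["AwsAccountId", "GeneratorId"])
    .reset_index(drop=True)
)

print(result)
```

Writing the result out would then be a call such as `result.to_csv("filtered.csv", index=False)`, where the file name is only a placeholder.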