Select column from pandas DataFrame where a given string is contained in another column that contains a list

Question

My pandas DataFrame looks like this:

id                            rule  impact           tags                          description                                           examples
0   1          \(\)\s*\{.*?;\s*\}\s*;       9    [rce, bash]           Shellshock (CVE-2014-6271)  [env x='() { :;}; echo vulnerable' bash -c "ec...
1   2  \(\)\s*\{.*?\(.*?\).*?=>.*?\\'       9    [rce, bash]           Shellshock (CVE-2014-7169)  [env X='() { (a)=>\' bash -c "echo date"; cat ...
2   3                     \{\{.*?\}\}       4      [rce, id]                   Flask curly syntax                                      [{{foo.bar}}]
3   4   \bfind_in_set\b.*?\(.+?,.+?\)       6  [sqli, mysql]  Common MySQL function "find_in_set"                [SELECT FIND_IN_SET('b','a,b,c,d')]
4   5                        ["'].*?>       3          [xss]                        HTML breaking                                               [">]

What I am interested in is extracting the regular expressions (the rule column) for each unique tag, that is, for each tag in this set:

attack_tags = {'sqlite', 'css', 'spam', 'mongo', 'sqli', 'dos', 'mssql', 'xss', 'mysql', 'php', 'tsql', 'pgsql', 'lfi', 'win', 'id', 'rfi', 'xxe', 'unix', 'bash', 'rce', 'perl', 'ldap'}

I tried the following code but it did not work:

for category in attack_tags:
  
    rules = list(df.query('{} in df[\'tags\']'.format(category))) # select rule from dataframe where  current_category (category) is in tags
    print(rules) # This should be a list that contains all the rules where the attack category is in df['tags'] column.

I am getting a KeyError that names the current category, for instance KeyError: 'mongo' or KeyError: 'php'.

Any recommendations?

you should provide the dataset as **text** and the expected output — mozway, Jan 27 '22 at 13:57
Please add a `.head()` of your DataFrame instead of a picture to make debugging easier. — Tzane, Jan 27 '22 at 13:57
Does this answer your question? [Filter pandas DataFrame by substring criteria](https://stackoverflow.com/questions/11350770/filter-pandas-dataframe-by-substring-criteria) — Tzane, Jan 27 '22 at 14:43
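
For the loop in the question, here is a minimal sketch of one possible approach, assuming the tags column holds real Python lists (not their string representations). The small frame is retyped from the five rows shown in the question, and `rules_by_tag` and `by_tag` are names made up for illustration. `DataFrame.query` treats the bare tag as an identifier rather than a string, which is probably where the KeyError comes from, so the sketch builds a boolean mask with `Series.apply` instead.

```python
import pandas as pd

# Sample data retyped from the question (only the columns needed here).
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "rule": [
        r"\(\)\s*\{.*?;\s*\}\s*;",
        r"\(\)\s*\{.*?\(.*?\).*?=>.*?\\'",
        r"\{\{.*?\}\}",
        r"\bfind_in_set\b.*?\(.+?,.+?\)",
        r"""["'].*?>""",
    ],
    # Assumption: each cell is an actual list of strings.
    "tags": [["rce", "bash"], ["rce", "bash"], ["rce", "id"],
             ["sqli", "mysql"], ["xss"]],
})

# A subset of the tags from the question; 'php' matches no row here.
attack_tags = {"rce", "bash", "id", "sqli", "mysql", "xss", "php"}

# Option 1: one boolean mask per tag, testing list membership row by row.
rules_by_tag = {}
for category in sorted(attack_tags):
    mask = df["tags"].apply(lambda tags: category in tags)
    rules_by_tag[category] = df.loc[mask, "rule"].tolist()

# Option 2: one row per (rule, tag) pair, then group the rules by tag.
# Only tags that actually occur in the frame appear as keys.
by_tag = df.explode("tags").groupby("tags")["rule"].agg(list).to_dict()

for category, rules in rules_by_tag.items():
    print(category, rules)
```

With the first option, a tag that matches no row (such as 'php' in this sample) maps to an empty list. The second option avoids the Python-level loop over tags but only returns tags that are present in the data.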
