
Background: I'm using a Perl script that submits abuse reports to abuseipdb.com. The script has only one default category (14 Port Scan), but I want to submit the correct category/categories for the abuse reports. Submitting the category is done by a number. Multiple categories are possible and are separated by a comma ','.

List of categories: https://www.abuseipdb.com/categories

Source of the script: https://www.abuseipdb.com/csf, scroll roughly halfway down for "abuseipdb_report.pl".

I've modified the script to scan the log text for keywords and, when one is found, assign the matching category number. (Disadvantage: only one category number can be used this way.)

It's working, but far from pretty. All those if and elsif statements will take a lot of processing time.

Here are the snippets that I've cooked up. (it's working!)

my $cat = '14';
my $logs = $ARGV[6];
if    ($logs =~ m/DOS-PROTECTION/)      {$cat = '4';}
elsif ($logs =~ m/PROTOCOL-ENFORCEMENT/){$cat = '15';}
elsif ($logs =~ m/PROTOCOL-ATTACK/)     {$cat = '15';}
elsif ($logs =~ m/DATA-LEAKAGES/)       {$cat = '16';}
elsif ($logs =~ m/IP-REPUTATION/)       {$cat = '19';}
elsif ($logs =~ m/SCANNER-DETECTION/)   {$cat = '19';}
elsif ($logs =~ m/APPLICATION-ATTACK/)  {$cat = '21';}
elsif ($logs =~ m/METHOD-ENFORCEMENT/)  {$cat = '23';}

my $data = {
    ip => $ARGV[0],
    comment => $comment,
    categories => $cat
};

I've tried arrays and foreach loops, but I'm a novice in Perl and can't get that to work. So I'm stuck with my if elsif code.

So now you have an idea what I'm trying to accomplish.

Is there a smarter faster way, possibly including multiple categories?

A sample of $logs:

2020/08/07 06:25:11 [error] 16769#0: *40996 [client 174.xxx.xxx.185] ModSecurity: Access denied with code 406 (phase 2). Matched "Operator `Within' with parameter `.asa/ .asax/ .ascx/ .axd/ .backup/ .bak/ .bat/ .cdx/ .cer/ .cfg/ .cmd/ .com/ .config/ .conf/ .cs/ .csproj/ .csr/ .dat/ .db/ .dbf/ .dll/ .dos/ .htr/ .htw/ .ida/ .idc/ .idq/ .inc/ .ini/ .key/ .licx/ .ln (150 characters omitted)' against variable `TX:EXTENSION' (Value: `.bak/' ) [file "/etc/modsecurity.d/REQUEST-920-PROTOCOL-ENFORCEMENT.conf"] [line "1015"] [id "920440"] [rev ""] [msg "URL file extension is restricted by policy"] [data ".bak"] [severity "2"] [ver "OWASP_CRS/3.3.0"] [maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-protocol"] [tag "paranoia-level/1"] [tag "OWASP_CRS"] [tag "capec/1000/210/272"] [tag "PCI/6.5.10"] [hostname "46.xxx.xxx.137"] [uri "/wp-config.php.bak"] [unique_id "159678151143.002383"] [ref "o13,4o14,3v5,17o35,5t:urlDecodeUni,t:lower
 case"], client: 174.xxx.xxx.185, server: <removed>, request: "GET /wp-config.php.bak HTTP/1.1", host: "<removed>", referrer: "http://<removed>/"

Second $logs sample:

2020/08/07 06:52:14 [error] 16769#0: *42613 [client 195.xxx.xxx.89] ModSecurity: Access denied with code 406 (phase 2). Matched "Operator `Rx' with parameter `(?i)(?:\x5c|(?:%(?:c(?:0%(?:[2aq]f|5c|9v)|1%(?:[19p]c|8s|af))|2(?:5(?:c(?:0%25af|1%259c)|2f|5c)|%46|f)|(?:(?:f(?:8%8)?0%8|e)0%80%a|bg%q)f|%3(?:2(?:%(?:%6|4)6|F)|5%%63)|u(?:221[56]|002f|EFC8|F025)|1u|5 (400 characters omitted)' against variable `REQUEST_URI_RAW' (Value: `/forum/../forum/index.php' ) [file "/etc/modsecurity.d/REQUEST-930-APPLICATION-ATTACK-LFI.conf"] [line "29"] [id "930100"] [rev ""] [msg "Path Traversal Attack (/../)"] [data "Matched Data: /../ found within REQUEST_URI_RAW: /forum/../forum/index.php"] [severity "2"] [ver "OWASP_CRS/3.3.0"] [maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-lfi"] [tag "paranoia-level/1"] [tag "OWASP_CRS"] [tag "capec/1000/255/153/126"] [hostname "46.xxx.xxx.137"] [uri "/forum/../forum/index.php"] [unique_id "15967831
 3492.204063"] [ref "o6,4v4,25"], client: 195.xxx.xxx.89, server: <removed>, request: "GET /forum/../forum/index.php HTTP/1.1", host: "<removed>"

Right now I'm matching the names of the ModSecurity .conf files, but it might be better to check the "tag" entries instead.
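A minimal sketch of the tag-based idea, assuming the `[tag "..."]` format seen in the two samples above. The `%tag_categories` hash is hypothetical: it reuses the numbers this question already assigns to the corresponding .conf names (21 for APPLICATION-ATTACK, 15 for PROTOCOL-ENFORCEMENT), and those should be checked against the AbuseIPDB category list.

```perl
use strict;
use warnings;

# Shortened stand-in for $ARGV[6]; only the [tag "..."] fields matter here.
my $logs = '[ver "OWASP_CRS/3.3.0"] [tag "application-multi"] '
         . '[tag "attack-lfi"] [tag "OWASP_CRS"]';

# Hypothetical tag-to-category mapping (an assumption, not a verified list).
my %tag_categories = (
    'attack-lfi'      => 21,
    'attack-protocol' => 15,
);

# In list context, a /g match returns every captured tag name.
my @tags = $logs =~ /\[tag "([^"]+)"\]/g;

# Exact hash lookups: keep known tags, map them to numbers, drop duplicates.
my %seen;
my @cats = grep { !$seen{$_}++ }
           map  { $tag_categories{$_} }
           grep { exists $tag_categories{$_} } @tags;

# Fall back to the script's default category when nothing matched.
@cats = (14) unless @cats;

print join(',', @cats), "\n";
```

Because the tags are extracted first and then looked up as whole strings, a short name such as `attack` cannot accidentally match inside `attack-lfi`.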


Solution
With the help of the answers given here, I've arrived at this solution, which works for me. The only thing left to do is fine-tune the hash of keywords and categories.

my $logs = "attack-protocol attack-reputation-scanner attack PROTOCOL-ENFORCEMENT   ";
my %categories = (
    'DOS-PROTECTION'        =>  4,
    'PROTOCOL-ENFORCEMENT'  => 15,
    'PROTOCOL-ATTACK'       => 15,
    'DATA-LEAKAGES'         => 16,
    'IP-REPUTATION'         => 19,
    'SCANNER-DETECTION'     => 19,
    'APPLICATION-ATTACK'    => 21,
    'METHOD-ENFORCEMENT'    => 23,
    'attack-lfi'            => 10,
    'attack-protocol'       => 11,
    'attack-reputation-scanner' => 12,
    'attack'                => 15,
);

my @cats = ();
for (keys %categories) {
  if ($logs =~ /$_/) {
    push @cats, $categories{$_};
  }
}

my %hash = map { $_ => 1 } @cats;
@cats = keys %hash;

print "categories => ", join(',', @cats), "\n";
Karel

2 Answers


The usual way to do these things is to join together all the tokens you're looking for into a single alternation which can be matched once. For example

my %keys = (
    'DOS-PROTECTION', 4,
    'PROTOCOL-ENFORCEMENT', 15,
    # ....
);
my $pattern = join '|', map quotemeta, sort keys %keys;
$pattern = qr/($pattern)/;

while (<DATA>) {
    my $cat = 14;
    $cat = $keys{$1} if /$pattern/;
    print "[$cat]\n";
}

__DATA__
blah blah DOS-PROTECTION blah blah
blah blah PROTOCOL-ENFORCEMENT blah blah
blah blah
blah blah
Dave Mitchell
  • I haven't had any success with your code **yet**. Do I need to substitute $logs? – Karel Aug 07 '20 at 11:02
  • My sample code was just a demo that reads log lines from the end of the file. Replace the while loop with however you get your log lines. If the line is in $logs, then change the condition to ... if $logs =~ /$pattern/. If $logs contains multiple lines you'll have to do things a bit differently. – Dave Mitchell Aug 07 '20 at 12:21
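For the multi-line case mentioned in the last comment, one possible sketch is to run the same alternation with /g in list context, which returns every match instead of only the first. The input string here is made up for illustration, and sorting the keys longest-first is an extra precaution (not part of the answer above) so that a short key cannot shadow a longer one that starts the same way.

```perl
use strict;
use warnings;

my %keys = (
    'DOS-PROTECTION'       => 4,
    'PROTOCOL-ENFORCEMENT' => 15,
    'APPLICATION-ATTACK'   => 21,
);

# Build one alternation, longest keys first, with metacharacters escaped.
my $pattern = join '|', map quotemeta,
              sort { length($b) <=> length($a) } keys %keys;
$pattern = qr/($pattern)/;

# Made-up two-line input standing in for $ARGV[6].
my $logs = "blah REQUEST-920-PROTOCOL-ENFORCEMENT.conf blah\n"
         . "blah REQUEST-930-APPLICATION-ATTACK-LFI.conf blah\n";

# /g in list context returns every captured key across all lines;
# map them to category numbers and keep the first occurrence of each.
my %seen;
my @cats = grep { !$seen{$_}++ } map { $keys{$_} } $logs =~ /$pattern/g;

# Default category from the original script when nothing matched.
@cats = (14) unless @cats;

print join(',', @cats), "\n";
```

The text is still scanned once with a single compiled pattern, but every keyword that occurs contributes its category.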

One obvious improvement is to store your categories in an array. That way, you can submit multiple categories:

my @cats = (14);

my $logs = $ARGV[6];
if    ($logs =~ m/DOS-PROTECTION/)          { push @cats, 4;}
elsif ($logs =~ m/PROTOCOL-ENFORCEMENT/)    { push @cats, 15;}
# etc...

my $data = {
  ip => $ARGV[0],
  comment => $comment,
  categories => join ',', @cats,
};

I'd also consider a data-driven approach where the strings you're matching are stored in a hash along with the associated categories.

my %categories = (
  'DOS-PROTECTION'       =>  4,
  'PROTOCOL-ENFORCEMENT' => 15,
  # etc...
);

You can then use a loop for the checks:

my @cats = (14);

my $logs = $ARGV[6];

for (keys %categories) {
  if ($logs =~ /$_/) {
    push @cats, $categories{$_};
  }
}
Dave Cross
  • I've got your code working! I took the data-driven approach. Thanks! Is there a way to remove duplicate categories? Unfortunately the receiving party does not filter out duplicate categories. – Karel Aug 07 '20 at 11:00
  • Found a way to remove duplicates: `my %hash = map { $_ => 1 } @cats; @cats = keys %hash;` Source: https://stackoverflow.com/a/12921325 – Karel Aug 07 '20 at 12:27
  • @karel It's simpler to skip the array and store the categories in a set, as the keys of a hash directly. `$cats{ $categories{$1} } = 1` to store and `keys %cats` to get the unique categories. – Schwern Aug 08 '20 at 06:01
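A rough sketch of the hash-as-set idea from the last comment, adapted to the loop from the answer above rather than to a `$1` capture. The sample input and the shortened `%categories` hash are made up for illustration. `\Q...\E` is added so each key is matched literally, and the numeric sort is only there because `keys` returns its results in no particular order.

```perl
use strict;
use warnings;

my %categories = (
    'PROTOCOL-ENFORCEMENT' => 15,
    'PROTOCOL-ATTACK'      => 15,
    'APPLICATION-ATTACK'   => 21,
);

# Made-up input that hits two keys sharing category 15.
my $logs = 'PROTOCOL-ENFORCEMENT PROTOCOL-ATTACK APPLICATION-ATTACK';

# Used as a set: the hash keys are the category numbers, so duplicates collapse.
my %cats;
for my $key (keys %categories) {
    # \Q...\E escapes any regex metacharacters in the key.
    $cats{ $categories{$key} } = 1 if $logs =~ /\Q$key\E/;
}

# Sort numerically for a stable, readable result.
my $list = join ',', sort { $a <=> $b } keys %cats;
print "$list\n";
```

The resulting string can then be used as the `categories` value in `$data`.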