I've formatted your query... if we break it down you seem to be applying a couple of functions inet_ntoa
. If you don't have a pressing need for then get rid of them ( especially if they look at a table ).
Secondly, if we look at your query you are doing a full scan of DB
3 times for your various counts, and then at the very minimum a range scan in your top level select.
SELECT sid, cid, timestamp, sig_name, inet_ntoa(ip_src), layer4_sport, inet_ntoa(ip_dst), layer4_dport
FROM DB
WHERE ( ip_dst IN ( SELECT ip_dst
FROM DB
GROUP BY ip_dst
HAVING count(*) > 1 )
AND timestamp IN ( SELECT timestamp
FROM DB
GROUP BY timestamp
HAVING count(*) > 1 )
AND layer4_dport IN ( SELECT layer4_dport
FROM DB
GROUP BY layer4_dport
HAVING count(*) > 1 )
)
By not linking your subquery back to the main table, you've assumed that ip_dst
, timestamp
and layer4_dport
are each unique across the whole table and then are trying to find where the unlikely occurrence of 3 independently unique values happened to have duplicates in the same row.
I suspect what you want to do is something like the following:
SELECT a.sid, a.cid, a.ip_dst, a.timestamp, a.sig_name, a.layer4_sport, a.layer4_dport
FROM DB a
JOIN ( SELECT timestamp, layer4_dport
FROM DB
GROUP BY timestamp, layer4_dport
HAVING count(*) > 1 ) b
ON a.timestamp = b.timestamp
AND a.layer4_dport = b.layer4_dport
This finds you all the rows where there are more than 1 identical timestamp
and layer4_dport
combinations as per your question.
If you want to find all the duplicates at the level of ip_dst
then you need to add this to your sub-query.