In a previous question the awesome amdixon was able to come up with a query to calculate the level of repetition of IPs.
I've adapted this to look at specific accounts using WHERE earning_account_id = ?
:
SELECT MAX(repeat_factor)
FROM
(
SELECT earning_ip, count(*) / rc.row_count AS repeat_factor
FROM earnings
CROSS JOIN (SELECT count(*) AS row_count FROM earnings WHERE earning_account_id = ?) rc
WHERE earning_account_id = ?
GROUP BY earning_ip
) q
However, now I want to add an additional level of security.
I want to apply the same sort of query. But instead of limiting it to an earning_account_id, I want to limit it to any account grouping that has a sighting of a specific ip address.
This is so I can have better global detection of proxy spam, if they use multiple alt accounts.
Please note that I will no longer be limiting the query using WHERE earning_account_id = ?
So in other words if the ip_address was "45.55.80.86"
+--------------------+-------------+---------------------------+
| earning_account_id | earning_ip | select row for repeat_factor query?|
+--------------------+-------------+---------------------------+
| 1 | 45.55.80.86 | YES |
| 1 | 45.55.80.86 | YES |
| 2 | 1.22.83.65 | NO |
| 2 | 91.15.76.37 | NO |
| 3 | 45.55.80.86 | YES |
| 4 | 61.25.76.37 | YES |
| 4 | 1.22.83.65 | YES |
| 4 | 45.55.80.86 | YES |
| 5 | 61.25.76.37 | NO |
+--------------------+-------------+---------------------------+
The value to be returned would be the repeat_factor for all earnings of this ip, but ignoring all accounts that have never included this ip address.
In other words, what I'm trying to find out is:
"how repetitive is this IP address within all accounts, but looking only at the accounts where this IP address has been sighted?"