This is the first query I was working with. The query took way too long to process. This will be processing millions of rows in the table.
select date(datetime), count(*) from LOG l1
where ((datetime >= str_to_date('2013-01-01 00:00:00','%Y-%m-%d %H:%i:%s'))
and (datetime < str_to_date('2013-01-30 00:00:00','%Y-%m-%d %H:%i:%s'))
and code_subcode in ('8008118', '8008218', '8008318', '8008418'))
and (
select count(*) from log l2
where l2.session_id = l1.session_id and l2.remote_ip = l1.remote_ip and code_subcode
in('8008119', '8008219', '8008319', '8008419')
and TIMEDIFF(l2.datetime, l1.datetime) < 1)
group by date(datetime) order by date(datetime);
After some reading I found that joins are supposed to be faster than nested-selects so I converted it to this...
SELECT date(l1.datetime), count(*) from LOG l1
INNER JOIN LOG l2 ON l2.SESSION_ID=l1.SESSION_ID
where ((l1.datetime >= str_to_date('2013-01-01 00:00:00','%Y-%m-%d %H:%i:%s'))
and (l1.datetime < str_to_date('2013-01-30 00:00:00','%Y-%m-%d %H:%i:%s'))
and l1.code_subcode in ('8008118', '8008218', '8008318', '8008418')
and l2.code_subcode in('8008119', '8008219', '8008319', '8008419')
and date(l1.datetime) = date(l2.datetime)
and TIMEDIFF(l2.datetime, l1.datetime) < 1)
GROUP BY date(l1.datetime)
ORDER BY DATE(l1.datetime);
Please help, thanks.
Thank you all for your help. It turns out that the reason it is taking so long to execute is because the program is silently failing on the server. When executing the query on my local environment it processes 800,000 lines in 48 seconds. If I have time I'll look into farther optimization, for now I have too much crap to get done.
Once again, thanks!