I am attempting to use a dialect of SQL to parse a set of records. Specifically I want to count the number records that contain various ids (words) occur in my logs. I also want to count the total number of times these ids are present.
The data is present in two different tables:
- id_status - A table contain
id
,status
. - request_records - A table which contains request records. An
id
might be present more than once in a given record.
See SQL Fiddle!
id_status
This table contains the ids and their current status.
id, status
sessions, ACTIVE
visits, DEPRECATED
duration, ACTIVE
...
request_records
Sentence record may contain over 300k records:
request
example.com/api?foo=sessions
example.com/api?bar=session%2Cvisits,foo=sessions
example.com/api?bar=duration,visits
example.com/api?foo=sessions
example.com/api?foo=visits,bar=visits
...
Goal
I want to create a table containing four columns id,status,occurence_count,record_count
.
occurence_count
: should be the total number of times an id occurs in all the records.recourd_count
: should be the the total number of records a particularid
occurs in.
As a result from the above example I would generate the following table:
id, status, occurence_count, recourd_count
sessions,ACTIVE,3,2
visits,DEPRECATED,4,3
duration,ACTIVE,1,1
SQL
Is there a way to construct an SQL query which will return the desired table? I would know how to do it fairly readily in Python or any other language but if possible I'd like to write a SELECT
statement to generate this table.
Update: As suggested below, I have attempted to execute the following:
SELECT ids.id, ids.status,
SUM(length(request) - LENGTH(replace(request, ids.id, ''))) / LENGTH(ids.id) as occurence_count,
COUNT(reqs.request) AS recourd_count
FROM id_status ids LEFT JOIN
request_records reqs
ON find_in_set(ids.id, reqs.requests) > 0
GROUP BY ids.id, ids.status;
However my dialect of SQL does not permit this type of join and returns the following error:
ON clause must be AND of = comparisons of one field name from each table, with all field names prefixed with table name.
Added SQL Fiddle with Sample Schema:
CREATE TABLE id_status
(`id` varchar(32), `status` varchar(32))
;
INSERT INTO id_status
(`id`, `status`)
VALUES
('sessions', 'ACTIVE'),
('visits', 'DEPRECATED'),
('duration', 'ACTIVE')
;
CREATE TABLE request_records
(`request` varchar(500))
;