Hi all I have a MySQL table that has a field of comma separated values
id res
=============================
1 hh_2,hh_5,hh_6
------------------------------
2 hh_3,hh_5,hh_4
------------------------------
3 hh_6,hh_8,hh_7
------------------------------
4 hh_2,hh_7,hh_4
------------------------------
Please see the above example ,Actually i need to compare each row 'res' with other row's 'res' values and need to display count if they match with others. Please help me to get the count.
For example, IN first row 'hh_2' also exist in fourth row so we need count as 2, likewise we need to compare all in all rows
I Have run the function its working for me. but the table so big. It have million of records so my performance take time. While check one record with 50000 record take 25 sec. Suppose my input is 60 rows it take one hour. Please help me how to optimize.
CREATE FUNCTION `combine_two_field`(s1 CHAR(96), s3 TEXT) RETURNS int(11)
BEGIN
DECLARE ndx INT DEFAULT 0;
DECLARE icount INT DEFAULT 0;
DECLARE head1 char(10);
DECLARE head2 char(10);
DECLARE head3 char(10);
WHILE ndx <= LENGTH(s1) DO
SET head1 = SUBSTRING_INDEX(s3, ',', 1);
SET s3 = SUBSTRING(s3, LENGTH(head1) + 1 + @iSeparLen);
SET head2 = SUBSTRING_INDEX(s1, ',', 1);
SET s1 = SUBSTRING(s1, LENGTH(head2) + 1 + @iSeparLen);
IF (head1 = head2) THEN
SET icount = icount + 1;
END IF;
SET ndx = ndx + 1;
END WHILE;
RETURN icount;
END
And the table size is too big and i want to reduce fetching time also ...
UPDATE QUERY:
DROP PROCEDURE IF EXISTS `pcompare7` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `pcompare7`(IN in_analysis_id INT(11))
BEGIN
drop table if exists `tmp_in_results`;
CREATE TEMPORARY TABLE `tmp_in_results` (
`t_id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
`r_id` bigint(11) NOT NULL,
`r_res` char(11) NOT NULL,
PRIMARY KEY (`t_id`),
KEY r_res (r_res)
)
ENGINE = InnoDB;
SELECT splite_snp(r_snp,id,ruid) FROM results WHERE technical_status = 1 and critical_status = 1 and autosomal_status = 1 and gender_status != "NO CALL" and analys_id = in_analysis_id;
-- SELECT * FROM tmp_in_results;
-- COmpare Functionality
SELECT a.t_id, b.id, SUM(IF(FIND_IN_SET(a.r_res, b.r_snp), 1, 0)) FROM tmp_in_results a CROSS JOIN results b GROUP BY a.t_id, b.id;
END $$
Function FOR CREATE TEMP TABLE:
DROP FUNCTION IF EXISTS `splite_snp` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `splite_snp`(s1 TEXT, in_id bigint(96), ruid char(11)) RETURNS tinyint(1)
BEGIN
DECLARE ndx INT DEFAULT 0;
DECLARE icount INT DEFAULT 0;
DECLARE head1 TEXT;
DECLARE head2 TEXT;
DECLARE intpos1 char(10);
DECLARE intpos2 char(10);
DECLARE Separ char(3) DEFAULT ',';
DECLARE iSeparLen INT;
SET @iSeparLen = LENGTH( Separ );
WHILE s1 != '' DO
SET intpos1 = SUBSTRING_INDEX(s1, ',', 1);
SET s1 = SUBSTRING(s1, LENGTH(intpos1) + 1 + @iSeparLen);
INSERT INTO tmp_in_results(r_id,r_res) VALUES(in_id,intpos1);
END WHILE;
RETURN TRUE;
END $$
New table structure
pc_input
id in_res in_id
=============================
1 hh_2 1000
------------------------------
2 hh_3 1000
------------------------------
3 hh_6 1001
------------------------------
4 hh_2 1001
------------------------------
res_snp
id r_res r_id
=============================
1 hh_2 999
------------------------------
2 hh_3 999
------------------------------
3 hh_9 999
------------------------------
4 hh_2 998
------------------------------
5 hh_6 998
------------------------------
6 hh_9 998
------------------------------
Result:
in_id r_id matches_count
=============================
1000 999 2 (hh_2,hh_3)
------------------------------
1000 998 1 (hh_2)
------------------------------
1001 999 1 (hh_2)
------------------------------
1001 998 2 (hh_2,hh_6)
------------------------------
I have add the separate index both table in_res,in_id and r_res and r_id
QUERY:
SELECT b.r_id,count(*) FROM pc_input AS a INNER JOIN results_snps AS b ON (b.r_snp = a.in_snp) group by a.in_id,b.r_id;
But mysql server was freeze. Cloud you please suggest any other way or optimize my query.
EXPLAIN TABLE: res_snp
Field Type Null Key Default Extra
id bigint(11) NO PRI NULL auto_increment
r_snp varchar(50) NO MUL NULL
r_id bigint(11) NO MUL NULL
EXPLAIN TABLE: pc_input
Field Type Null Key Default Extra
id bigint(11) NO PRI NULL auto_increment
in_snp varchar(55) NO MUL NULL
in_id bigint(11) NO MUL NULL
Explain Query:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE a ALL in_snp NULL NULL NULL 192 Using temporary; Using filesort
1 SIMPLE b ref r_snp r_snp 52 rutgers22042014.a.in_snp 2861 Using where0