I realize this is a sizable query and there's obviously a lot going on. The 2nd to last line of this query compares MD5'd hashes of text
fields. I tried this instead of comparing the values, but it didn't reduce the query execution time at all.
Fields in sob_datas
can be linked to each other via sob_datas_to_links
and this query compares the values for 2 forms (each have 517 rows in sob_datas
) and pulls the linked ones that do not share the same values.
SELECT
sob_datas.id,
sob_datas.sob_field_name
FROM sob_datas
WHERE sob_form_id = '.$formId.' AND
EXISTS(SELECT
sob_datas_to_links.id
FROM sob_datas_to_links
INNER JOIN sob_datas AS sub_sob_datas ON
sub_sob_datas.id = sob_datas_to_links.sob_datas_id
INNER JOIN sob_forms ON (
sob_forms.id = sub_sob_datas.sob_form_id AND
sob_forms.is_proof = 0 AND
sob_forms.archived IS NULL
)
WHERE sob_datas_to_links.link_id = (
SELECT
link_id
FROM sob_datas_to_links AS sub_sob_datas_to_links
WHERE sub_sob_datas_to_links.sob_datas_id = sob_datas.id
) AND
sub_sob_datas.sob_field_name = sob_datas.sob_field_name AND
sub_sob_datas.hash != sob_datas.hash
)
.
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY sob_datas ref sob_form_id,sob_form_id_2 sob_form_id [BINARY - 1B] const 563 Using where
2 DEPENDENT SUBQUERY sob_forms ALL PRIMARY NULL NULL NULL 937 Using where
2 DEPENDENT SUBQUERY sub_sob_datas ref PRIMARY,sob_form_id,sob_form_id_2 sob_form_id [BINARY - 3B] summaries_dev.sob_forms.id,summaries_dev.sob_datas... 1 Using where
2 DEPENDENT SUBQUERY sob_datas_to_links ref sob_datas_id sob_datas_id [BINARY - 1B] summaries_dev.sub_sob_datas.id 1 Using where
3 DEPENDENT SUBQUERY sub_sob_datas_to_links ref sob_datas_id sob_datas_id [BINARY - 1B] summaries_dev.sob_datas.id 1 Using where