0

UPDATE : For anyone reading this , as @dnoeth stated below indexing makes a huge difference when it comes to query time. In some cases it cut row scans down to 1 row. You just have to pick the flavor of indexing that suits you best . Clustered vs Non Clustered

Index example:

E.g: create index ix_descr on my_db.media (description asc)

I have a query that is taking a long time to run, it has has quite a number of functions in the where clause. What is the best approach to optimizing my query ?

Moving functions out of where clause and replacing them with joins,etc ?

I've given a few simple examples below, FIG 2 and 3 are my optimatization attempts, let me know your thoughts.

Note* these aren't the actual functions I am using, I use dbo.DecryptBlob(m.mediaId) etc. I chose to use the functions below to make the question easier to grasp.

FIG 1.)

SELECT 
    description, id
FROM
    my_db.media
where 
    length(description) = 10
    and description like "S%"
    and trim(left(m.description,2)) = 'St'

POSSIBLE OPTIMIZATIONS

FIG 2.) (Using join and group by)

SELECT
    m.description, m.mediaId
FROM
    my_db.media m
        inner JOIN
    my_db.media m1 ON length(m.description) = 10 
        and m.description like "S%"
        and trim(left(m.description,2)) = 'St'
group by m.mediaId

FIG 3.) (Using join and distinct)

SELECT distinct
    m.description, m.mediaId
FROM
    my_db.media m
        inner JOIN
    my_db.media m1 ON length(m.description) = 10 
        and m.description like "S%"
        and trim(left(m.description,2)) = 'St'
Fabii
  • 3,820
  • 14
  • 51
  • 92
  • Why the downvote ? Please elaborate . Should I phrase my question differently so it elicits more of an answer rather than opinions ? – Fabii Jul 26 '17 at 15:36
  • 1
    You can try https://dev.mysql.com/doc/refman/5.7/en/using-explain.html – quAnton Jul 26 '17 at 15:38
  • 1
    Adding a self-join increases complexity, it doesn't matter if a condition is palced in WHERE or ON, I doubt there's improvement, should be way less efficient. Better *index* your data. Btw, you can remove `LENGTH` and the stupid `TRIM(LEFT)` and simplify to a single condition: `WHERE description like "St________"` – dnoeth Jul 26 '17 at 15:38
  • @dnoeth thanks for the suggestions. – Fabii Jul 26 '17 at 15:47

2 Answers2

0

Your starting comment I phrase as "Don't hide an indexed column inside a function, the Optimizer can't find it to use the INDEX."

and description like "S%"
and trim(left(m.description,2)) = 'St'

is the same as

and description LIKE "St%"

With that, INDEX(description) will work nicely. The check for length=10 is secondary, and not a real problem. It will do a "range" scan of that index. (If you did not mean '2' and 'St', then my comment may not apply.)

TRIM -- Clean up the input rather than trying to do it during the SELECT.

In this situation, using a JOIN will show things down.

It is "wrong" to do group by m.mediaId without specifying mediaId in the SELECT list. So, I don't know what you were aiming at in Fig.2.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • What if I have no choice but to use a function on an indexed column say dbo.DecryptBlob(clientId) etc, what the best work around for a case like that ? – Fabii Jul 27 '17 at 14:13
  • "dbo"? This is MySQL. – Rick James Jul 27 '17 at 22:30
  • yes that is correct, lets say for syntax and semantics purpose we have the MySQL equivalent of dbo.DecryptBlob(), what work around could I use in that case. – Fabii Jul 28 '17 at 20:35
0

If your application frequently selects by the length of the description, you should consider adding a column that contains the length of the description and creating an index on it.

bikeman868
  • 2,236
  • 23
  • 30
  • I more so asking about the best way to tackle moving functions out of a where clause, whether it more beneficial to do joins, index etc. @Dnoeth noted that joins increase the complexity of the query, I removed the join and indexed the columns and noticed and increase in performance. – Fabii Jul 27 '17 at 14:20
  • Sometimes people are very focussed on the solutions that jump out at them, and don't step back and look at other possibilities. In general if people come across this thread I wanted to point out that the options you were considering are not the only ways to solve this type of performance problem. – bikeman868 Jul 27 '17 at 15:22