1

Currently I'm using the following query: (I'm using the following implmenetation of levenshtein()

SELECT title FROM db_full WHERE MATCH (`Title`) AGAINST ('Harry Potter and the Deathly Hallows' IN BOOLEAN MODE) AND levenshtein('Harry Potter and the Deathly Hallows', `Title`) BETWEEN 0 AND 15

I was wondering if it was faster to use the combination of MySQL + PHP something like below

$testQ = "SELECT title FROM db_full WHERE MATCH (`Title`) AGAINST ('Harry Potter and the Deathly Hallows' IN BOOLEAN MODE)";
} 
result = $conn->query($testQ);

while ($row = $result->fetch_assoc()) {
   $distance = levenshtein($v,$row['title']); 

 // using the if condinational for the distance
}

Which procedure would be faster and more efficient?

Community
  • 1
  • 1

1 Answers1

1

In general, the correct answer to "Which is faster" is "Try it on your system and your data". In this case, that is the best answer. I do not know off-hand which system (MySQL or PHP) has a faster implementation of Levenshtein. Probably nobody does, because this depends on system configurations.

There are reasons to think that doing the work in the database would be faster:

  • Database servers tend to be more powerful than machines running applications.
  • Running the code in the database will reduce the volume of data being passed from the server to the application.

From a maintainability (aesthetic ?) perspective, I prefer to have all logic co-located, so that is another argument for doing the work in the database. However, that has little to do with performance.

However, it is possible that the volume of data being passed back is pretty small and that PHP has a better implementation of Levenshtein than does MySQL. You should test the two approaches if marginal performance improvement is important for your application.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thank you! I'll compare both and have a look. An important point being that PHP implementation of levenshtein might be faster since it is an inbuilt function –  May 13 '15 at 10:58
  • Note: I've tested both the methods. The levenshtein on the database is terribly slow. 4.102 seconds for a DB with 1.9Million entries. The levenshtein PHP was much faster with 0.515 seconds . I think that gives us our winner. –  May 13 '15 at 11:08