
I often run into the situation where I want to determine if a value is in a table. Queries tend to happen in short bursts with similar values being searched, so I want to do this in the most efficient way. What I have now is

if($statement = mysqli_prepare($link, 'SELECT name FROM inventory WHERE name = ? LIMIT 1'))//name and inventory are arbitrarily chosen for this example
{
    mysqli_stmt_bind_param($statement, 's', $_POST['check']);
    mysqli_stmt_execute($statement);
    mysqli_stmt_bind_result($statement, $result);
    mysqli_stmt_store_result($statement);//needed for mysqli_stmt_num_rows
    mysqli_stmt_fetch($statement);
}
if(mysqli_stmt_num_rows($statement) == 0)
{
    //value in table
}
else
{
    //value not in table
}

Is it necessary to call all the mysqli_stmt_* functions? As discussed in this question, for mysqli_stmt_num_rows() to work the entire result set must be downloaded from the database server. I'm worried this is a waste and takes too long, since I know there is only 0 or 1 row. Would it be more efficient to use the SQL COUNT() function and not bother with mysqli_stmt_store_result()? Any other ideas?

I noticed the prepared statement manual says "A prepared statement or a parametrized statement is used to execute the same statement repeatedly with high efficiency". What is highly efficient about it, and what does "same statement" mean? For example, if two separate prepared statements evaluated to the same statement, would it still be more efficient?
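To make that second question concrete: is a loop like the following (a rough, untested sketch with placeholder values) the "execute the same statement repeatedly" case the manual means, and would preparing the identical SQL string in a second mysqli_prepare() call get the same benefit?

if($statement = mysqli_prepare($link, 'SELECT name FROM inventory WHERE name = ? LIMIT 1'))
{
    $check = '';
    mysqli_stmt_bind_param($statement, 's', $check);//bound by reference, so changing $check changes what gets sent
    foreach(array('foo', 'bar', 'baz') as $value)//placeholder values for this example
    {
        $check = $value;
        mysqli_stmt_execute($statement);
        //...fetch/check the result for this value here...
    }
    mysqli_stmt_close($statement);
}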

By the way, I'm using MySQL but didn't want to add the tag, as a solution may be non-MySQL specific.

Celeritas

3 Answers

if($statement = mysqli_prepare($link, 'SELECT name FROM inventory WHERE name = ? LIMIT 1'))//name and inventory are arbitrarily chosen for this example
{
    mysqli_stmt_bind_param($statement, 's', $_POST['check']);
    mysqli_stmt_execute($statement);
    mysqli_stmt_store_result($statement);
}
if(mysqli_stmt_num_rows($statement) == 0)
{
    //value not in table
}
else
{
    //value in table
}

I believe this would be sufficient. Note that I switched //value not in table and //value in table.
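If you want to avoid mysqli_stmt_store_result() altogether, a variation (an untested sketch, reusing the question's table and POST field) is to rely on the return value of mysqli_stmt_fetch(): with LIMIT 1 there is at most one row, so a successful fetch means the value exists.

if($statement = mysqli_prepare($link, 'SELECT name FROM inventory WHERE name = ? LIMIT 1'))
{
    mysqli_stmt_bind_param($statement, 's', $_POST['check']);
    mysqli_stmt_execute($statement);
    mysqli_stmt_bind_result($statement, $name);
    if(mysqli_stmt_fetch($statement))//true when a row was fetched
    {
        //value in table
    }
    else
    {
        //value not in table
    }
    mysqli_stmt_close($statement);
}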

givemesnacks

It really depends on the type of field you are searching on. Make sure you have an index on that field and that the index fits in memory. If it does, use SELECT COUNT(*) FROM <your_table> WHERE <cond_which_uses_index> LIMIT 1. The important part is LIMIT 1, which prevents unnecessary lookups. You can run EXPLAIN SELECT ... to see which indexes are used and, if needed, add a hint or ban some of them; that's up to you.

COUNT(*) works very fast: it is optimized by design to return a result quickly (MyISAM only; for InnoDB the whole thing is a bit different due to ACID). The main difference between COUNT(*) and SELECT <some_field(s)> is that COUNT doesn't read any row data, and with (*) it doesn't care whether a field is NULL or not; it just counts rows using the most suitable index (chosen internally). Actually, I'd suggest that even for InnoDB this is the fastest technique.
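Applied to the inventory/name example from the question, that would look roughly like this (a sketch only; the index name idx_inventory_name is my own placeholder, and I assume the mysqli connection in $link from the question):

//one-time setup: index the column you search on
mysqli_query($link, 'CREATE INDEX idx_inventory_name ON inventory (name)');
//run EXPLAIN SELECT COUNT(*) ... in the MySQL client to confirm the index is actually used

if($statement = mysqli_prepare($link, 'SELECT COUNT(*) FROM inventory WHERE name = ? LIMIT 1'))
{
    mysqli_stmt_bind_param($statement, 's', $_POST['check']);
    mysqli_stmt_execute($statement);
    mysqli_stmt_bind_result($statement, $count);
    mysqli_stmt_fetch($statement);//always exactly one row for an aggregate
    mysqli_stmt_close($statement);
    if($count > 0)
    {
        //value in table
    }
    else
    {
        //value not in table
    }
}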

The use case also matters. If you want to insert a unique value, put a constraint on that field and use INSERT IGNORE; if you want to delete a value which may not be in the table, run DELETE IGNORE, and the same goes for UPDATE IGNORE. A minimal sketch of the insert case follows.
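Here is that insert-if-unique sketch (it assumes a unique key on inventory.name, which is not in the original question; the key name uniq_name is a placeholder):

//one-time setup: the IGNORE trick needs a UNIQUE constraint to collide with
mysqli_query($link, 'ALTER TABLE inventory ADD UNIQUE KEY uniq_name (name)');

//insert only if the value is not already there; no separate existence check needed
if($statement = mysqli_prepare($link, 'INSERT IGNORE INTO inventory (name) VALUES (?)'))
{
    mysqli_stmt_bind_param($statement, 's', $_POST['check']);
    mysqli_stmt_execute($statement);
    if(mysqli_stmt_affected_rows($statement) == 1)
    {
        //row was inserted (value was not in the table)
    }
    else
    {
        //nothing inserted (value was already there)
    }
    mysqli_stmt_close($statement);
}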

The query analyzer determines by itself whether two queries are the same or not and manages the query cache, so you don't have to worry about it.

The difference between a prepared and a regular query is that the first one carries the rule and the data separately, so the analyzer can tell which data is dynamic and handle it better, optimize it, and so on. It can do the same for a regular query, but with a prepared statement we say up front that we will reuse it later, which gives a hint about which parts are variable and which are fixed. I'm not very strong on MySQL internals, so you may want to ask such questions on more specialized sites to understand the details.

P.S.: Prepared statements in MySQL are session-scoped, so when the session in which they were defined ends, they are deallocated. The exact behavior and possible internal MySQL caching are a subject for additional investigation.

pinepain

This is the kind of thing in-memory caches are really good at. Something like this should work better than most micro-optimization attempts (pseudocode!):

function check_if_value_is_in_table($value) {
  if ($cache->contains_key($value)) {
    return $cache->get($value);
  }
  // run the SQL query here, put result in $result
  // note: I'd benchmark if using mysqli_prepare actually helps
  // performance-wise
  $cache->put($value, $result);
  return $result;
}

Have a look at memcache or the various alternatives.
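For concreteness, with the Memcached PECL extension the pseudocode above might look roughly like this (still just a sketch; the key scheme, the 5-minute expiry, and the exact query are my assumptions, not requirements):

//setup (once per request/pool)
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

function check_if_value_is_in_table($cache, $link, $value) {
  $key = 'inventory:' . $value; //assumed key scheme
  $cached = $cache->get($key);
  if ($cache->getResultCode() === Memcached::RES_SUCCESS) {
    return (bool) $cached;
  }
  //cache miss: fall back to the database
  $statement = mysqli_prepare($link, 'SELECT 1 FROM inventory WHERE name = ? LIMIT 1');
  mysqli_stmt_bind_param($statement, 's', $value);
  mysqli_stmt_execute($statement);
  mysqli_stmt_store_result($statement);
  $result = mysqli_stmt_num_rows($statement) > 0;
  mysqli_stmt_close($statement);
  $cache->set($key, $result, 300); //5-minute expiry, arbitrary
  return $result;
}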

CAFxX
  • I'm not a downvoter, but it looks like a non-trivial task to keep memcache and MySQL in a consistent state, especially if you have a few web machines and a few replica sets. Second, the original question was not exactly about speed-up techniques, so caching looks like a next-step optimization, and in this situation not one of the best: just imagine 10k possible keys of 16 chars each (even at 8 bits per char). It's better to put the DB index in that memory than memcache keys. – pinepain Jun 17 '13 at 21:28