I'm currently using a utf8 mysql database. It checks if a translation is already in the database and if not, it does a translation and stores it in the database.
SELECT * FROM `translations` WHERE `input_text`=? AND `input_lang`=? AND `output_lang`=?;
(The other field is "output_text".) For a basic database, it would first compare, letter by letter, the input text with the "input_text" "TEXT" field. As long as the characters are matching it would keep comparing them. If they stop matching, it would go onto the next row.
I don't know how databases work at a low level but I would assume that for a basic database, it would search at least one character from every row in the database before it decides that the input text isn't in the database.
Ideally the input text would be converted to a hash code (e.g. using sha1) and each "input_text" would also be a hash. Then if the database is sorted properly it could rapidly find all of the rows that match the hash and then check the actual text. If there are no matching hashes then it would return no results even though each row wasn't manually checked.
Is there a type of mysql storage engine that can do something like this or is there some additional php that can optimize things? Should "input_text" be set to some kind of "index"? (PRIMARY/UNIQUE/INDEX/FULLTEXT)
Is there an alternative type of database that is compatible with php that is far superior than mysql?
edit: This talks about B-Tree vs Hash indexes for MySQL:
http://dev.mysql.com/doc/refman/5.5/en/index-btree-hash.html
None of the limitations for hash indexes are a problem for me. It also says
They are used only for equality comparisons that use the = or <=> operators (but are very fast)
["very" was italicized by them]
NEW QUESTION:
How do I set up "input_text" TEXT to be a hash index? BTW multiple rows contain the same "input_text"... is that alright for a hash index?
http://dev.mysql.com/doc/refman/5.5/en/column-indexes.html
Says "The MEMORY storage engine uses HASH indexes by default" - does that mean I've just got to change the storage engine and set the column index to INDEX?