In trying to rejuvinate code I wrote mostly 14+ years ago. I've come to see that the lovely little setup I wrote then was... lacking in certain places, namely handling user inputs.
Lesson: Never underestimate users ability to inject trash, typos, and dupes past your validators.
The old way is reaching critical mass as there are 470 items in a SELECT dropdown now. I want to reinvent this part of the process so I don't have to worry about it hitting a breaking point.
So the idea is to build a fuzzy search method so that after the typist enters the search string, we check against five pieces of data, all of which reside in the same row.
I need to check the name submitted against the Stage Name, two also-known-as names, as well as their legal name and as a final check against a soundex() index based on their Stage Name (this catches a few spelling errors missed otherwise)
I've tried a complicated block of code to check these things (and it doesn't work, mostly because I think I coded the comparisons too strict) as part of a do/while loop.
In the below, var $Rin
would contain the user supplied name.
$setr = mysql_query("SELECT ID,StageName,AKA1,AKA2,LegalName,SoundEx FROM performers");
IF ($R = mysql_fetch_array($setr)) {
do {
$RT = substr(trim($Rin), 5);
$RT1 = substr($R[1], 5);
$RT2 = substr($R[2], 5);
$RT3 = substr($R[3], 5);
$RT4 = substr($R[4], 5);
$RTx = soundex($RT);
IF ($RT == $RT1) {
$RHits[] = $R[0];
}
IF ($RT == $RT2) {
$RHits[] = $R[0];
}
IF ($RT == $RT3) {
$RHits[] = $R[0];
}
IF ($RT == $RT4) {
$RHits[] = $R[0];
}
IF ($RTx == $R[5]) {
$RHits[] = $R[0];
}
} while ($R = mysql_fetch_array($setr));
}
The idea being that I'll build an array of the ID#'s of the near hits, which I'll populate into a select dropdown that has only hopefully fewer hits that the whole table. Which means querying for a result set from the contents of that array, in order to display the Performer's name in the SELECT dropdown and pass the ID# as the value for those choices.
Thats when I hit the 'I need to use an array in my WHERE clause' problem, and after finding that answer, I am starting to suspect I'm out of luck due to Stipulation #2 below. So I started looking at alternate search methods and I'm not sure I've gotten anywhere but more confused.
So, is there a better way to scan a single table for six fields, checking five against user input and noting the sixth for display in a subset of the original table?
Thought process:
Against the whole table, per record, test $Rin against these tests in this order:
$Rin -> StageName
$Rin -> AKA1
$Rin -> AKA2
$Rin -> LegalName
soundex($Rin) -> SoundEx
where a hit on any of the five operations adds the ID# to a result array that is used to narrow the results from 470 performers down to a reasonable list to choose from.
Stipulations:
1) As written, I know this is vulnerable to an SQL injection attack.
2) Server runs PHP 4.4.9 and MySQL 4.0.27-Standard, I can't upgrade it. I've got to prove it works before money will be spent.
3) This is hobby-level stuff, not my day job.
4) Performers often use non-English names or elements in their names, and this has led to typos and duplication by the data entry typists.
I've found a lot of mysqli and PDO answers for this sort of thing, and I'm seeing a lot of things that only half make sense (like link #4 below). I'm working on getting up to speed on these things as I try and fix whats become broken.
Places already looked: