4

I am building a "related resources" page that searches our library for papers having to do with a specific aircraft. If a visitor is researching the XV-1, I want to find all papers containing the designation "XV-1" in the title. But I don't want to find any papers on the XV-15. The aircraft designation is a ColdFusion variable, and we are using MySQL.

What I am using now:

WHERE title LIKE <cfqueryparam cfsqltype="cf_sql_varchar" value="%#trim(aircraftDesignation)#%" />

This returns "The XV-1's Rotor System" as expected, but also returns "Using RotCFD to Predict Isolated XV-15 Rotor Performance" which we don't want.

I think that if I could query on the aircraft designation, without allowing for any numeric characters immediately afterwards, but allowing for other characters such as apostrophe or comma, that might work. However, I tried

'%#trim(aircraftDesignation)#[^0-9]%'

and

'%#trim(aircraftDesignation)#[!0-9]%'

neither of which return anything at all. I would greatly appreciate any assistance or nudges in the right direction, if this is at all possible! Thank you very much for your advice.

UPDATE: I have done a poor job explaining. My apologies! We do have a field for Aircraft Designation. The related resources query runs only when a visitor lands on a specific page. For example, if he or she lands on the XV-1 page, Aircraft Designation will have be assigned a value of "XV-1." The query for related resources happens automatically based on that assigned Aircraft Designation (different per page). No user input is taken.

My effort to not include numbers after "XV-1" was an attempt to prevent papers on the XV-15 from showing on the XV-1 page. Papers on the XV-15 would show on the XV-15 page, since "XV-15" is the Aircraft Designation. But papers on the XV-167 (if there was such a thing) would not show on the XV-15 page.

I tagged this as ColdFusion because Aircraft Designation is a CF variable being used in the query. My attempts to use SQL (and my research into using REGEXP are either failing because I seem to be not using the variable correctly. Or in the case of REGEXP, unable to use a variable in conjunction with a regular expression.

I hope this helps clarify! Thank you all so much for your help.

ANOTHER UPDATE: Using REGEXP like this:

WHERE title REGEXP '#trim(aircraftDesignation)#[^0-9]'

is working fine so far! A paper with "XV-1" in the title, plus the "XV-1's" paper, are both showing. But the XV-15 papers are not. All of the other searches I have spot-checked seem to be working as expected.

The full text searching is a great option though and I will definitely look into it further. Thanks everybody for your help and suggestions! I really appreciate it.

daltec
  • 447
  • 1
  • 5
  • 15
  • 2
    You might want to try using MySql's `REGEX` operator instead of `LIKE`. http://dev.mysql.com/doc/refman/5.7/en/regexp.html – beloitdavisja Feb 24 '16 at 04:24
  • 2
    How would you distinguish between XV-1 and XV-15? The latter has a numeric character after the former, so not allowing numeric characters after the query criteria wouldn't make any difference here. I think you need to have a designated column for your aircraft designation and query against that. This isn't really a ColdFusion question. – Gary Stanton Feb 24 '16 at 10:34
  • Your programming efforts would be better spent on your search page. Maybe related selects would be a better option than a text box. – Dan Bracuk Feb 24 '16 at 13:02
  • Thanks everybody. I am researching REGEXP but it seems to be used by itself, not in conjunction with a variable. I need to read more on it, though, so thanks for the tip! Gary and Dan, I added an explanation that might make things a little more clear. I appreciate your help! – daltec Feb 24 '16 at 19:23
  • Does your database include any relationship between aircraft and documents? – Dan Bracuk Feb 24 '16 at 20:27
  • 1
    It is not that you cannot use a variable in conjunction with a regular expression, but that you must be able to transform the variable contents *into* a valid regular expression. For example, you could use [REGEXP with optional word boundaries](http://stackoverflow.com/a/18902555/104223). Granted, technically you could probably achieve the same result with a LIKE match, but .... the sql would almost certainly be more convoluted, possibly less performant. Depending on your needs another option to look into is [full text searching](https://dev.mysql.com/doc/refman/5.7/en/fulltext-search.html). – Leigh Feb 24 '16 at 22:10
  • Hey Dan, no, in fact the document library is in a totally separate database. – daltec Feb 25 '16 at 03:45
  • Hey Leigh, thanks for the advice! I looked into full text searching, and set up a table to experiment. I was getting good results for some aircraft, but those with a "-" in the designation did not return any results. For example, "AH-64D" did not give any results, even though we have a ton of papers with "AH-64D" in the title. But "A129" gave us results, no problem. – daltec Feb 25 '16 at 03:48

1 Answers1

1

I would like to thank beloitdavisja for pointing me in the right direction with his comment above:

"You might want to try using MySql's REGEX operator instead of LIKE. dev.mysql.com/doc/refman/5.7/en/regexp.html – beloitdavisja"

This gave me the results I was looking for, plus a lot of other useful info. Thanks! Per Leigh's point, here is the query I ended up using (cfqueryparam omitted for brevity):

SELECT sku, title, content, fileName
FROM sd_productsearch
WHERE title REGEXP '#trim(aircraftDesignation)#[^0-9]' AND sku NOT LIKE 'v_%' AND status = '1'
ORDER BY title

A little explanation: this query only pulls our technical papers, not our general interest magazine articles (SKU numbers starting with "v_"). It also only shows papers active on our library (status of 1). The REGEXP I used allows for a search of the specific Aircraft Designation for that aircraft's page, but filters out any numbers afterwards. It's similar to what I attempted to do initially, but I was unaware of the REGEXP operator. So thanks again beloitdavisja!

UPDATE: thanks for the reminders, Leigh, of scoping variables and (especially) cfqueryparam. If anybody is wondering how to use it with REGEXP, this worked for me:

WHERE title REGEXP <cfqueryparam cfsqltype="cf_sql_varchar" value="#trim(aircraftDesignation)#[^0-9]">
daltec
  • 447
  • 1
  • 5
  • 15
  • So what SQL did you end up using? Updating your answer to include a snippet of the actual SQL will help the next guy that runs into this problem. – Leigh Feb 25 '16 at 07:51
  • Good point Leigh! I had edited my original question, but you are right, the answer I arrived belongs in the answer! :-) Also, the sd_productsearch table referred to is a new version of our full products table, set up to use full-text searches. Thanks for your suggestion -- I was unfamiliar with full-text searching and word boundaries, as well, and we are going to get some real use out of them in other applications. Thanks very much! – daltec Feb 25 '16 at 16:23
  • 1
    Nice follow-up. Just keep in mind REGEXP has reserved characters, like any language. If `#aircraftDesignation#` contains any of those characters it might break the SQL or produce unexpected results. So you will probably want to add some validation. Also, do not forget to scope your variables AND use cfqueryparam (otherwise the query is is vulnerable to sql injection!). – Leigh Feb 25 '16 at 18:26
  • Never mind... I see you are using cfqueryparam probably just omitted it for brevity :) (Updating the answer to note that) – Leigh Feb 25 '16 at 18:44