1

I'm attempting to index around 30,000 database records in a single collection and per my requirements I need to be able to include a list of items in a single custom field - and use that in my search.

Here's an example of my index:

<cfindex collection = "myCollection" 
             action = "refresh"
             type = "custom"
             query = "Local.myQuery" 
             key = "ID" 
             title="Title"
             applications_s="A_Comma_Separated_List"
             body = "a_field,a_nother_field">

In this example, applications_s is a dynamic custom field (introduced in CF10) containing a list of application IDs. An example of content for this field would be:

T1,T2,B4,G1

This all indexes splendidly, however I've been unable to figure out how to search, using a single item in the applications list as criteria.

So, I'd like to be able to do this:

<cfsearch name="Local.qSearch"
    collection="myCollection"
    criteria="test AND applications_s:T1">

This should return all records that contain the word 'test' in the body, and also contain 'T1' in the applications field. However, I can't find a criteria syntax that will treat the contents of the custom field as a comma separated list... it seems to only work as a string. Therefore my example record wouldn't be returned unless I include a wildcard - which could cause problems with extra records being returned by mistake.

Is there any way to explicitly specify that my custom field is a list and should contain my specified value?

Gary Stanton
  • 1,435
  • 12
  • 28
  • Let me preface my comments by saying that I have not used `cfindex` in the way that you are asking about. After reading the [docs here](http://help.adobe.com/en_US/ColdFusion/10.0/CFMLRef/WSc3ff6d0ea77859461172e0811cbec22c24-7d04.html) I have a couple of questions/comments. First, can you name the custom attributes anything you want or do they need to be `custom1`, `custom2`, `custom3`, `custom4`? Second, the docs say that the custom attributes should specify a query column name when the `type=custom` like your example. Are your application IDs contained in a column of your database? – Miguel-F Jun 04 '13 at 18:30
  • Hi Miguel, As of CF10 you can dynamically add custom fields and name them whatever you like - it's the same as using 'custom1, custom2' etc. I'd be happy to use customX if it made a difference. Secondly, yes the application IDs come back in a database column. The cfindex example works exactly as I'd expect, I'd just like to find a syntax that would allow me to match the custom column, treated as a list. – Gary Stanton Jun 04 '13 at 21:56
  • I'm beginning to think that this simply cannot be done. I can find no information about *searching* a custom field. Seems I can look for an exact match, or a match using a wildcard, but since I can't have a wildcard as the first character, this is pretty useless to me. This seems like a *really* obvious thing to want to do... am I missing something or is it seriously not supported?? – Gary Stanton Jun 05 '13 at 09:01
  • Just for reference, the docs regarding dynamically named custom fields are here: http://help.adobe.com/en_US/ColdFusion/10.0/Developing/WSe61e35da8d318518-1acb57941353e8b4f85-7ff9.html - For some reason, this information is not included in the main CF10 CFIndex docs page. – Gary Stanton Jun 05 '13 at 10:37
  • And with further experimentation, it seems you *can* search inside a custom field, but only those of type 'text'. Unfortunately, I still can't find a way to treat the data in that field as a list, and so a search for 'T2' will also match 'T2B'. I think this may be as good as it gets. – Gary Stanton Jun 05 '13 at 10:51
  • Thanks for sharing that documentation link, I did not see that before. According to that page it appears as though the custom fields are limited to the supported types listed. And a list is not one of the supported types. Following the naming convention makes your field `applications_s` a string which is how you say it seems to be working. After reading the docs it seems to me as though these custom fields are meant to store information about the collection itself, and not additional information about each of the records in the index. What is your intention with the application IDs? – Miguel-F Jun 05 '13 at 12:43
  • I just stumbled upon this page: [Search limitations](http://help.adobe.com/en_US/ColdFusion/10.0/Developing/WSe61e35da8d318518-1acb57941353e8b4f85-7ff8.html). I think it explains the behavior you are seeing. It further explains the `text` type and appears as though that might work for you. You mentioned doing some testing with the `text` type. The docs mention the values in this type of field are tokenized. Did you try searching this type of field for your unique IDs? Wildcards should not be needed for the tokenized values. Add values delimited by spaces, not commas, and see if this works. – Miguel-F Jun 05 '13 at 12:51
  • I did try using the text field type, and it is more searchable (as opposed to matches) but it still doesn't fit my needs... A search for 'T1' in that field will still return rows containing 'T1B' for instance... I think the bottom line is its not possible and I need to find another way to achieve my goal... Such is life. – Gary Stanton Jun 05 '13 at 21:41

1 Answers1

3

I managed to get the following to work on CF9.0.1. Although the MYCUSTOMNAME_TYPE (e.g. applications_s) fields are CF10-only, I was able to use the custom1 field and specify it as a "string" type by editing the collection's schema.xml and restarting Solr. You shouldn't have to on CF10.

1) In the query you're indexing, add TWO commas to the beginning of the application list column, and ONE at the end, so an example row would look like:

,,T1,T1B,T2,B4,G1,

You could do this either in your SQL using concatenation (preferable), or by post-processing the query result with Query-of-Queries, or QueryNew() and looping over the query to build a copy.

2) Index the query with cfindex as in your question, using applications_s to ensure the field is a string type, not text. We don't want the list to be "tokenised" as words. The commas are critical and we don't want them to be ignored.

3) In your cfsearch pad the criteria as follows:

<cfset searchString= "test">
<cfset applicationFilter = "T1">
<cfsearch name="Local.qSearch"
    collection="myCollection"
    criteria="#searchString# AND applications_s:,*,#applicationFilter#,*">

Note there are 3 commas and 2 wildcard asterisks altogether. The first comma is there because you cannot start a Solr query with a wildcard. The second and third commas ensure that the wildcard search for T1 does not match T1B.

CfSimplicity
  • 2,338
  • 15
  • 17
  • Ooooh... there's a thought - so you're just adding the extra comma to get around the wildcard limitation and then including the delimiter in the search... I think that might just work. I won't get a chance to test that 'til next week now, but I'll let you know. – Gary Stanton Jun 06 '13 at 13:10
  • Yes it's a hack but it seems to work on CF9. I don't have access to CF10, but do let us know. It is important that the field type is "string" and not "text". If it's "text" then all delimiters are ignored whereas you need them to ensure the match is exact. – CfSimplicity Jun 06 '13 at 14:52
  • Seems to work well on CF10 too. Thanks for the help! As an aside, I noticed issues when inserting values that contain special characters or nulls into custom fields. The error doesn't give you any info whatsoever about the offending row, but worth keeping in mind that the custom fields should be pretty heavily sanitised. – Gary Stanton Jun 12 '13 at 10:18