I am building a basic search engine using MongoDB. I have verified that the basic query works in the mongo shell, but I don't quite understand how to translate it into PHP.

Spaces in the input string signify 'and' operators, and pipe characters (|) are the 'or' operators. The input query changes, but could be something along these lines (minus the quotes!):

'o g|ra'

That would be equivalent to writing:

(o&&g)||(ra)
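As a rough illustration of that grammar, the input can be split into OR-groups of AND-terms before any query is built. This is only a sketch: `parseSearchInput` is a hypothetical helper name, and it assumes that empty terms and empty groups should simply be dropped.

```php
<?php
// Hypothetical helper: split a search string into OR-groups of AND-terms.
// '|' separates groups; whitespace separates the terms inside a group.
function parseSearchInput($textInput)
{
    $groups = array();
    foreach (explode('|', $textInput) as $orPart) {
        // Split on runs of whitespace and discard empty pieces.
        $terms = preg_split('/\s+/', trim($orPart), -1, PREG_SPLIT_NO_EMPTY);
        if (count($terms) > 0) {
            $groups[] = $terms;
        }
    }
    return $groups;
}

// 'o g|ra' becomes two groups: the terms o and g, then the term ra.
print_r(parseSearchInput('o g|ra'));
```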

Basic mongo query (please note I am not trying to translate this exact query every time; it needs to be flexible in terms of the number of $ands and $ors). I have tested this and it works fine:

db.scores.find({$or:[{Title:/o/i, Title: /g/i},{Title:/ra/i}]})

The code that I have produced in PHP is this:

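// Neither $orArray nor $stringArray is defined in the snippet as posted;
// the two lines below are an assumed setup.
$orArray = explode('|', $textInput);   // one entry per 'or' group
$stringArray = array();                // will hold the terms of each group

if (strpos($textInput, '|') !== false)
{
    // Split every 'or' group into its space-separated 'and' terms.
    foreach ($orArray as $item)
    {
        $itemMod = explode(" ", $item);
        array_push($stringArray, $itemMod);
    }

    $masterAndQueryStack = array();

    foreach ($stringArray as $varg)
    {
        $multiAndQuerySet = array();

        // Build one case-insensitive regex condition per term.
        foreach ($varg as $obj)
        {
            $searchText = '/' . $obj . '/i';
            $regexObj = new MongoRegex($searchText);
            $singleQuery = array('Title' => $regexObj);
            array_push($multiAndQuerySet, $singleQuery);
        }

        // Each group is pushed as a plain list of conditions.
        array_push($masterAndQueryStack, $multiAndQuerySet);
    }

    $orAndQueryStack = array('$or' => $masterAndQueryStack);
    return $orAndQueryStack;
}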

This is the query returned by the PHP code. As you can see, the 'and' terms have been put in an array. I can't see any way of storing them without pushing them to an array, but it seems that MongoDB's $or does not accept an array as one of its clauses. I'm just not sure how to rework the search algorithm to account for this.

Array 
(
    [$or] => Array
    (
        [0] => Array 
        ( 
            [0] => Array ( [Title] => MongoRegex Object ( [regex] => o [flags] => i ) )
            [1] => Array ( [Title] => MongoRegex Object ( [regex] => g [flags] => i ) ) 
        )
        [1] => Array 
        ( 
            [0] => Array ( [Title] => MongoRegex Object ( [regex] => ra [flags] => i ) ) 
        ) 
    ) 
)
jjcohen
  • You can flip the first $and regex to use groups to detect an $and or you can use the actual $and operator. – Sammaye Aug 08 '12 at 11:09

2 Answers

To expand on my comment, MongoDB has an $and operator: http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24and

You can nest it within your first $or, giving:

Array 
(
    [$or] => Array
    (
        [0] => Array
        (
            [$and] => Array 
            ( 
                [0] => Array ( [Title] => MongoRegex Object ( [regex] => o [flags] => i ) )
                [1] => Array ( [Title] => MongoRegex Object ( [regex] => g [flags] => i ) )
            )
        )
        [1] => Array 
        ( 
            [Title] => MongoRegex Object ( [regex] => ra [flags] => i ) 
        ) 
    ) 
)
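A nested structure of that shape could be produced along the following lines. This is a minimal sketch, not a definitive implementation: `buildTitleQuery` and its `$makeRegex` factory are hypothetical names, and the factory is only there so the example runs without a MongoDB driver installed. The `$and` wrapper matters because a PHP array (like a query document) can hold only one `Title` key, so two conditions on the same field cannot sit side by side in one clause.

```php
<?php
// Hypothetical helper: turn OR-groups of AND-terms into a nested query array.
// $makeRegex is an assumed factory that converts one term into a regex value.
function buildTitleQuery(array $groups, $makeRegex)
{
    $orClauses = array();
    foreach ($groups as $terms) {
        if (count($terms) === 0) {
            continue; // skip empty groups, an empty $and would be invalid
        }
        $conditions = array();
        foreach ($terms as $term) {
            $conditions[] = array('Title' => call_user_func($makeRegex, $term));
        }
        // One condition is already a document; several need an explicit $and.
        $orClauses[] = count($conditions) === 1
            ? $conditions[0]
            : array('$and' => $conditions);
    }
    if (count($orClauses) === 0) {
        return array(); // nothing to filter on
    }
    // A single clause needs no $or wrapper.
    return count($orClauses) === 1 ? $orClauses[0] : array('$or' => $orClauses);
}

// Stand-in factory that returns a pattern string, escaping regex metacharacters.
$asString = function ($term) {
    return '/' . preg_quote($term, '/') . '/i';
};

$query = buildTitleQuery(array(array('o', 'g'), array('ra')), $asString);
print_r($query);
```

With the driver used in the question, the factory would presumably return `new MongoRegex('/' . preg_quote($term, '/') . '/i')` instead of a plain string. Escaping with `preg_quote` is an optional choice made here; drop it if users are meant to type regex syntax.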

You can also express the $and logic inside the regex itself. There is some information on regex syntax here: http://www.regular-expressions.info/refadv.html
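One way to fold the 'and' logic into a single regex is to use lookaheads, one per term, and join the groups with alternation. The sketch below assumes single-line titles and non-empty groups; `buildTitlePattern` is a hypothetical helper name, and the pattern is tested here with PHP's own `preg_match` rather than against a database.

```php
<?php
// Hypothetical helper: "all terms of a group, or any group" as one regex.
function buildTitlePattern(array $groups)
{
    $alternatives = array();
    foreach ($groups as $terms) {
        $lookaheads = '';
        foreach ($terms as $term) {
            // Each lookahead asserts that the term occurs somewhere in the title.
            $lookaheads .= '(?=.*' . preg_quote($term, '/') . ')';
        }
        $alternatives[] = $lookaheads;
    }
    // Anchor at the start so every lookahead scans the whole title.
    return '/^(?:' . implode('|', $alternatives) . ')/i';
}

$pattern = buildTitlePattern(array(array('o', 'g'), array('ra')));

var_dump(preg_match($pattern, 'Dog'));   // contains both o and g
var_dump(preg_match($pattern, 'Grape')); // contains ra
var_dump(preg_match($pattern, 'Ox'));    // o without g, and no ra
```

The whole search then becomes a single condition on Title, at the cost of a pattern that is harder to read than the nested $or/$and form.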

Sammaye

Not sure what sort of corpus of data you have to search, but there are some significant limitations with your current approach:

All of the above caveats may be fine if you don't have a large data set to search.

Some more performant alternatives would be:

Stennie
  • Many thanks for the extra thoughts. It's not a big corpus by any means, and the application seems to be performing pretty quickly. Unfortunately, case sensitivity and generating tags are not really options for this implementation. With regard to relevance and ordering, I was planning to do that client-side (using JavaScript) so the user can specify how they see the results; this offers far more flexibility (for the user) over returning . – jjcohen Aug 10 '12 at 09:44