Short Version
I would like to efficiently perform a full text search within an arbitrary set of objects in my database. All objects will be indexed in a search engine.
My idea
I'm planning on making this a two-part operation. First, the search engine would be queried for a weighted/sorted set of ids matching the full text search. This set of ids would then be filtered to remove any ids not in the user's original set.
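A minimal sketch of that second (filtering) step, assuming the search engine hands back ids best-match first and the user's working set is just a collection of vertex ids (the ids and helper name here are hypothetical):

```python
def filter_ranked_ids(ranked_ids, allowed_ids):
    """Keep only ids present in the user's set, preserving
    the search engine's rank order."""
    allowed = set(allowed_ids)  # set gives O(1) membership tests
    return [vid for vid in ranked_ids if vid in allowed]

# Search engine result, best match first:
ranked = ["v9", "v2", "v7", "v4"]
# The user's current working set:
user_set = ["v4", "v2", "v100"]
print(filter_ranked_ids(ranked, user_set))  # → ['v2', 'v4']
```

Since the working sets are at most a few thousand ids, building the set and scanning the ranked list once should be cheap compared to the search query itself.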
Is there a better way to do this? If not, can you provide any advice on doing this efficiently?
Long Version
I am in the planning phase of building a web application that will allow users to visualize sets of highly linked data and manipulate these visualizations to derive sets of interesting vertices for further analysis. The filtering actions performed by the user through the GUI will be complex and very difficult to express as indexable quantities.
I would like to allow the users to perform full text search for results within these data sets. Looking at what Google does for searching within a result set, their approach of simply appending an earlier search query to a new query to enable "search within" may not be feasible for my data.
The accepted answer to this question promotes the idea of using database operations to filter results coming from a search engine.
As part of the solution, I am also considering having the front end switch over to using lunr when the set of vertices the user wants to search within gets small enough for the front end to handle. Figuring out what this limit is will take some testing, but I doubt it will be more than a few thousand, so the need for a server-side solution remains.
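The handoff decision could be as simple as a size check on the server before responding; the threshold constant below is a placeholder to be tuned by testing, as noted above:

```python
# Hypothetical cutoff; the real value would come from benchmarking
# how many documents the front end (lunr) can index comfortably.
CLIENT_SIDE_LIMIT = 2000

def should_search_client_side(vertex_ids, limit=CLIENT_SIDE_LIMIT):
    """Return True if the working set is small enough to ship to the
    browser for client-side (lunr) indexing, False to search server-side."""
    return len(vertex_ids) <= limit

print(should_search_client_side(range(500)))   # → True
print(should_search_client_side(range(5000)))  # → False
```

When this returns True, the server could include the full searchable text of the vertices in its response so the front end can build a lunr index; otherwise only the filtered id list is returned and searches stay server-side.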
Environment Details
I'm running Python 2.7 on App Engine.
In this application, I expect the initial result sets (that will be searched within) to contain between 10 and 2,000 vertices. The total number of vertices in the entire database could be a couple of orders of magnitude larger.