This follows on from the post "Checking if a field contains a string" and the reply from okoboko.

I have created a text index on one of the fields in my collection. If I use something like:

db.users.find( { $text: { $search: "son" } } )

The query is fast, which is great. However, I want to search the index for strings that contain punctuation, since my text field contains URLs. To retrieve all documents related to stackoverflow, I tried:

for page in myCollection.find( { "$text": { "$search": "\"stackoverflow\.com\"" } } ):
    print (page['_id'])

But this does not work. What is the fastest way to search a collection for fields that contain a string with punctuation?

I do not get an error, but my code hangs and returns nothing. In Task Manager I can see that Python is eating up memory and the MongoDB server is working hard too.

When I use this code, the results come back very quickly, but I want to include .com too:

for page in myCollection.find( { "$text": { "$search": "\".stackoverflow\"" } } ):
    print (page['_id'])

When I use this code, I get results, but it takes about as long as using $regex:

for page in ScrapedPagesCollection.find( { "$text": { "$search": "\"stackoverflow.com\"" } } ):
    print (page['_id'])
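
For comparison, the $regex query mentioned above could be built roughly as in the sketch below. The helper name `url_regex_query` and the field name `url` are hypothetical, since the actual field is not named here. The only point is that the dot is escaped with `re.escape` so it matches a literal "." rather than any character. As far as I know, an unanchored regex like this cannot use an index efficiently, which would explain why it is slow.

```python
import re


def url_regex_query(field, needle):
    """Build a $regex filter that matches `needle` literally anywhere in `field`.

    `field` is a hypothetical field name; substitute the real one.
    """
    # re.escape turns "." into "\." so the dot is not treated as a wildcard.
    return {field: {"$regex": re.escape(needle)}}


# Hypothetical usage: pass the result to collection.find(...)
query = url_regex_query("url", "stackoverflow.com")
```

The resulting dictionary can be passed to `find()` in place of the `$text` filter.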
Jack

1 Answer

This works for me:

import pymongo

db = pymongo.MongoClient()['mydatabase']
db.mycollection.insert_one( { 'site': 'https://www.stackoverflow.com' } )
db.mycollection.create_index([('site', pymongo.TEXT)])

print(list(db.mycollection.find( { '$text': { '$search': 'stackoverflow.com' } } )))

gives:

[{'_id': ObjectId('5e3584b1534c1043defcd5bb'), 'site': 'https://www.stackoverflow.com'}]
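
One caveat, as far as I understand MongoDB's text search: an unquoted `$search` string is split into terms on whitespace and punctuation, and the terms are ORed together. So 'stackoverflow.com' is probably treated as "stackoverflow" OR "com" and may also match unrelated .com URLs. Wrapping the string in double quotes asks for a phrase match instead. The sketch below only builds the two filter documents so the difference is visible. The helper names are my own, not part of pymongo, and it assumes the phrase itself contains no double quotes.

```python
def text_terms_query(terms):
    """Term search: MongoDB tokenizes `terms` and ORs the resulting words."""
    return {"$text": {"$search": terms}}


def text_phrase_query(phrase):
    """Phrase search: the phrase is wrapped in double quotes inside $search.

    Assumes `phrase` contains no double quotes of its own.
    """
    return {"$text": {"$search": '"' + phrase + '"'}}


terms_filter = text_terms_query("stackoverflow.com")
phrase_filter = text_phrase_query("stackoverflow.com")

print(terms_filter)   # quotes absent: term search
print(phrase_filter)  # quotes present: phrase search
```

Either filter can be passed to `db.mycollection.find(...)`. The phrase form is the stricter of the two, but it may be slower on a large collection, since the server has to check candidate documents for the exact phrase.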
Belly Buster