-3

I'm trying to find emphasized words in a text using the following regex:

r'\b[A-Z]{2,}\b'

My query in MongoDB is as follows:

db.getCollection('_collection').find({'text': {'$regex': "\b[A-Z]\b"}})

But I get no results, even though I know there are documents whose text contains emphasized words.

By "emphasized word" I mean a word written in capital letters: in "he really LOVES this towel", "LOVES" is an emphasized word.

Lior Magen
  • 2
    That is not what the metacharacter `\b` is used for. – styvane Jul 02 '17 at 08:47
  • @S.M.Styvane I'm more baffled at what an "emphasized word" is. We can kind of guess it meant "bold" (hence the "b"), but the OP seems off on their own little trip and isn't really listening. Best just to mark as "unclear what you are asking" anyway. – Neil Lunn Jul 02 '17 at 09:09
  • 1
    BTW, did you try `"\\b[A-Z]{2,}\\b"`? – Wiktor Stribiżew Jul 02 '17 at 09:21
  • Duplicate of [Regular expression for checking if capital letters are found consecutively in a string?](https://stackoverflow.com/questions/4050381/regular-expression-for-checking-if-capital-letters-are-found-consecutively-in-a). Can someone else close, please? I already voted as unclear, but the OP confirms they are looking for "Capital" letters. – Neil Lunn Jul 02 '17 at 10:35
  • @NeilLunn I didn't ask how to find this pattern using regex; the question is about finding it in MongoDB. In the question I give the pattern for finding this kind of word. – Lior Magen Jul 02 '17 at 11:08
  • You actually have not said a single true or sensible thing here at all. I'd rather just see the question deleted altogether. – Neil Lunn Jul 02 '17 at 11:11
  • AFAIK, and as @WiktorStribiżew commented, using `"\\b[A-Z]{2,}\\b"` instead of `"\b[A-Z]\b"` in your find expression will do all you want ;). – shA.t Jul 02 '17 at 12:12
  • Try `db.getCollection('_collection').find({'text':{'$regex':'\\b[A-Z]{2,}\\b'}})` ;). – shA.t Jul 02 '17 at 12:15
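
A likely reason the original query returns nothing, consistent with the `"\\b..."` suggestions in the comments: in an ordinary (non-raw) string literal, `\b` is the backspace character rather than the regex word-boundary token, and this holds for JavaScript strings in the mongo shell as well as for Python strings. The sketch below checks this with Python's `re` module as a stand-in for the server-side regex engine, which is an assumption; the `query` dict only illustrates the filter shape and is never sent to a database.

```python
import re

text = "he really LOVES this towel"

# Non-raw literal: each "\b" becomes one backspace character (U+0008).
plain = "\b[A-Z]{2,}\b"
# Raw literal and doubled backslashes both keep the two-character token \b.
raw = r"\b[A-Z]{2,}\b"
escaped = "\\b[A-Z]{2,}\\b"

backspace_match = re.search(plain, text)   # looks for literal backspaces
boundary_match = re.search(raw, text)      # looks for word boundaries

print(raw == escaped)
print(backspace_match)
print(boundary_match.group() if boundary_match else None)

# Illustrative filter only (hypothetical usage, not executed against MongoDB).
query = {"text": {"$regex": raw}}
print(query)
```

With the backslashes preserved, the pattern finds "LOVES"; with backspaces in their place it finds nothing, because the sample text contains no backspace characters.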

1 Answer

0

I just realized that the regex format in MongoDB is a bit different from the one in Python.

The query I needed is:

db.getCollection('_collection').find({'text': {'$regex': '[A-Z]{2,}'}}) 

This matches any text containing at least two consecutive uppercase letters.

The query without the length condition, which matches any text containing at least one uppercase letter, is:

db.getCollection('_collection').find({'text': {'$regex': '[A-Z]'}}) 
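
As a rough check of what these unanchored patterns accept compared with a word-bounded one, here is a minimal sketch using Python's `re` module as a stand-in for MongoDB's regex engine (an assumption). Apart from the towel sentence, the sample strings are made up for illustration.

```python
import re

samples = [
    "he really LOVES this towel",  # contains a fully capitalized word
    "I use MongoDB daily",         # capitals only inside a mixed-case word
    "no caps here",                # no uppercase letters at all
]

any_upper = re.compile(r"[A-Z]")            # at least one uppercase letter
run_of_two = re.compile(r"[A-Z]{2,}")       # two or more in a row, anywhere
whole_word = re.compile(r"\b[A-Z]{2,}\b")   # a whole word of 2+ capitals

# One (any_upper, run_of_two, whole_word) tuple of booleans per sample.
results = [
    (
        bool(any_upper.search(s)),
        bool(run_of_two.search(s)),
        bool(whole_word.search(s)),
    )
    for s in samples
]

for s, r in zip(samples, results):
    print(s, r)
```

The second sample shows the difference: `[A-Z]{2,}` accepts it because of the "DB" inside "MongoDB", while the word-bounded pattern does not, since there is no word boundary between "o" and "D".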
Lior Magen
  • 1
    What are you talking about? Both Python and MongoDB use the same `pcre` library! What is an "emphasized word" anyway? – Neil Lunn Jul 02 '17 at 09:04
  • @NeilLunn If you have the sentence "he really LOVES this product", then "LOVES" is an emphasized word. – Lior Magen Jul 02 '17 at 10:26