
I have an input field where the user can search for just a word or a sentence.

Let's say in a block of text:

Paragraphs are the building blocks of papers. Many students define paragraphs in terms of length: a paragraph is a group of at least five sentences, a paragraph is half a page long, etc. In reality, though, the unity and coherence of ideas among sentences is what constitutes a paragraph. A paragraph is defined as “a group of sentences or a single sentence that forms a unit” (Lunsford and Connors 116). Length and appearance do not determine whether a section in a paper is a paragraph. For instance, in some styles of writing, particularly journalistic styles, a paragraph can be just one sentence long. Ultimately, a paragraph is a sentence or group of sentences that support one main idea. In this handout, we will refer to this as the “controlling idea,” because it controls what happens in the rest of the paragraph.

As a user, I type in `students define paragraph`. I want the regex to find `students define paragraph` as a whole phrase, and also `students`, `define`, and `paragraph` individually.

Expected

Paragraphs are the building blocks of papers. Many students define paragraphs in terms of length: a paragraph is a group of at least five sentences, a paragraph is half a page long, etc. In reality, though, the unity and coherence of ideas among sentences is what constitutes a paragraph. A paragraph is defined as “a group of sentences or a single sentence that forms a unit” (Lunsford and Connors 116). Length and appearance do not determine whether a section in a paper is a paragraph. For instance, in some styles of writing, particularly journalistic styles, a paragraph can be just one sentence long. Ultimately, a paragraph is a sentence or group of sentences that support one main idea. In this handout, we will refer to this as the “controlling idea,” because it controls what happens in the rest of the paragraph.

So far I have tried using `/students(.*?)define?paragraph/gmi`, and also putting the words in individual parentheses. I was told to run more than one regex search, but that will cause a long run time. I am wondering whether a single regex can do this.

I also tried `/students(?define)(.*?)paragraph/gmi`, but this doesn't return the individual words if there isn't an endpoint to group the match.

Yikern
  • Can you help me understand the question, please? If you are not ranking them, just searching for 'student' or 'define' or 'paragraph' will return all sentences that have one, two, or any combination of the three, including when they appear in tandem. Or do you not want a sentence where, for example, 'students' appears **after** 'define'? That is, if both are present, should 'students' immediately precede 'define'? – Reza Feb 15 '19 at 04:32
  • How do you expect to match `paragraph` when your query contains `paragraphs`? A more extreme example, how do you expect to match `car` when your query contains `carpet`, or `java` when your query contains `javascript`? ;) – Patrick Roberts Feb 15 '19 at 04:35
  • [This post](/a/49092029/3634538) might help you. – Yom T. Feb 15 '19 at 04:36
  • My apologies, @Reza. I want to match exact sentences and every individual occurrence of each word. I was thinking of writing my own recursive index function, but I want to see if regex can help with that, because I tried looping through the sentence using `Array.prototype.forEach` after splitting the sentence up by spaces. – Yikern Feb 15 '19 at 04:42

1 Answer


You will need to construct a regex that contains the user's input as-is and, in addition, split the query on spaces and add each token as an alternative so that the words can also be matched individually. Besides that, as far as I can see you want to match the singular form as well, so you need to make the final s optional by changing students to students?. You may need to do more work here depending on the other kinds of words in the language. For your example search query students define paragraphs, the regex you need is this:

students? define paragraphs?|students?|define|paragraphs?

Demo
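As a quick check, here is a minimal sketch of that pattern applied in JavaScript. The sample text is shortened from the question, and the `i` flag is an assumption carried over from the `/gmi` flags used there.

```javascript
// Sample text shortened from the question.
const text = 'Many students define paragraphs in terms of length: a paragraph is a group of sentences.';

// The phrase alternative comes first: alternatives are tried left to right,
// so a full-phrase hit wins over a single-word hit at the same position.
// The "i" flag (assumed) mirrors the case-insensitive search in the question.
const pattern = /students? define paragraphs?|students?|define|paragraphs?/gi;

const matches = text.match(pattern);
console.log(matches);
// → [ 'students define paragraphs', 'paragraph' ]
```

Note that the pattern has no word boundaries, so `define` would also match inside `defined`.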

Here is a function which you can use to generate the regex like I mentioned above,

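function createRegex(str) {
  // Full phrase first: make a trailing "s" on each word optional, keeping the spacing.
  var newStr = str.replace(/s(?:( +)|$)/g, 's?$1');
  // Then append every word as its own alternative.
  var arr = str.split(/ +/g);
  for (var s of arr) { // declare s so it does not leak as an implicit global
    newStr = newStr.concat('|').concat(s.replace(/s$/g, 's?'));
  }
  return newStr;
}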

console.log(createRegex('students    define paragraphs'));
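If the query comes straight from an input field, it may contain characters that are special in a regex, or irregular spacing. The following is only a sketch of one way to handle that, not part of the function above: `escapeRegex`, `wordPattern` and `buildSearchRegex` are hypothetical names, and the choice to return a `RegExp` with the `gi` flags is an assumption.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: escape a word and make a trailing "s" optional.
function wordPattern(word) {
  return escapeRegex(word).replace(/s$/, 's?');
}

// Hypothetical variant of createRegex that returns a RegExp object.
function buildSearchRegex(query) {
  const words = query.trim().split(/\s+/).filter(Boolean).map(wordPattern);
  if (words.length === 0) return null; // nothing to search for
  // Full phrase first (any run of whitespace between words), then each word.
  const phrase = words.join('\\s+');
  return new RegExp([phrase, ...words].join('|'), 'gi');
}

const re = buildSearchRegex('students    define paragraphs');
console.log(re.source);
// → students?\s+define\s+paragraphs?|students?|define|paragraphs?

const sample = 'Many students define paragraphs in terms of length: a paragraph can be short.';
console.log(sample.match(re));
// → [ 'students define paragraphs', 'paragraph' ]
```

Joining the words with `\s+` means the phrase still matches when the user types several spaces between words, which the literal-space version would not.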
Pushpesh Kumar Rajwanshi