-1

I have a long text that contains many kinds of whitespace. To get an array of words out of it, I do var words = whole_text.split(/\s+/);

Now I want to get the text after words[124], including all the original whitespace. How would I do that? The reason is that I want to get a character's position in my text after a click on that character. I would be happy to hear alternative approaches as well.

miha64
  • Is there any chance of the input text *starting* with whitespace? – CertainPerformance Mar 24 '19 at 20:49
  • No. The text always starts with a letter. – miha64 Mar 24 '19 at 20:51
  • Can you please explain what you mean by `get a character position in my text after a click on the character` – ManavM Mar 24 '19 at 21:01
  • Click on the letter -> get the position of the letter relative to the text. – miha64 Mar 24 '19 at 21:08
  • I meant how are you registering clicks? What is the DOM like? Is it a single `<p>` tag with all the text? Is every word in a `<span>`? You seem to be able to find the `chosen word`, which implies you are able to distinguish between clicks on different words. A broader picture will help us give you a better solution. – ManavM Mar 24 '19 at 21:17
  • It is basically a book. Every word is in a `<span>`. Every paragraph is in a `<p>`. All the headlines are also in a `<p>`. Basically only spans and ps. – miha64 Mar 24 '19 at 21:21
  • I posted an answer that should be able to solve the underlying task. Not sure if it's more efficient than regexing the whole thing. – ManavM Mar 24 '19 at 21:58

4 Answers

1

Given the word index you want, one option is to build a regular expression that repeats \S+\s+ (a word followed by its trailing whitespace) index + 1 times, which matches everything up to and including the whitespace after that word. For example:

const str = 'foo     bar baz buzz foooo barrr bazzz   buzzzz';
const words = str.split(/\s+/);

console.log(words[2]);
// to get the text after the word at [2]:
const re = new RegExp(String.raw`(?:\S+\s+){3}`);
const textAfterWords2 = str.replace(re, '');
console.log('text after ' + words[2] + ' is:');
console.log(textAfterWords2);

console.log(words[5]);
// to get the text after the word at [5]:
const re2 = new RegExp(String.raw`(?:\S+\s+){6}`);
const textAfterWords5 = str.replace(re2, '');
console.log('text after ' + words[5] + ' is:');
console.log(textAfterWords5);

// to get just the index in the original string:
const indexOfBarrrEnd = str.match(re2)[0].length;
console.log(indexOfBarrrEnd, str.slice(indexOfBarrrEnd));
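
The hard-coded repetition counts above can be wrapped in a small function. This is a minimal sketch rather than part of the original answer: the name textAfterWord is hypothetical, and it assumes (as stated in the comments on the question) that the text starts with a non-whitespace character.

```javascript
// Hypothetical helper: returns the text following the word at `wordIndex`,
// where words are counted the same way as str.split(/\s+/).
// Assumes the text does not start with whitespace.
function textAfterWord(str, wordIndex) {
  // Skip (wordIndex + 1) runs of "word followed by whitespace".
  const re = new RegExp(String.raw`^(?:\S+\s+){${wordIndex + 1}}`);
  const match = str.match(re);
  // No match means the word is the last one or the index is out of range.
  return match === null ? null : str.slice(match[0].length);
}

const sample = 'foo     bar baz buzz foooo barrr bazzz   buzzzz';
console.log(textAfterWord(sample, 2));
console.log(textAfterWord(sample, 5));
```

Like the snippet above, this also consumes the whitespace directly after the chosen word. If that whitespace should be kept, the pattern could end with a bare \S+ for the last word instead of \S+\s+.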
CertainPerformance
0

If you need the position of the word you can use findIndex:

let words = whole_text.trim().split(/\s+/);
let position = words.findIndex(word => word === chosenWord);

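I used trim to remove leading and trailing whitespace.

And then you slice the original text. Since position is a word index rather than a character offset, convert it first:

// Character offset just past the word at `position` (assumes position >= 0)
const offset = whole_text.match(new RegExp(`^\\s*(?:\\S+\\s+){${position}}\\S+`))[0].length;
console.log(whole_text.slice(offset));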
  • This will fail if the string has the selected word twice. For example if the second instance of 'bar' was clicked in the string `bar foo bar foofoo`, then your script will return `foo bar foofoo`, which does not work as intended. – ManavM Mar 24 '19 at 21:03
  • @ManavM Yes, you are absolutely right! The best way I can think of right now: if the words are displayed in separate elements, their indexes can also be attached to those elements as an attribute. Then, when one is clicked, we can get the right index. – Diyorbek Sadullaev Mar 24 '19 at 21:13
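
One way to act on the idea in the last comment is to compute every word's character offsets once, so that repeated words stay distinguishable. The following is only a sketch: wordOffsets is a hypothetical helper, and how the offsets get attached to the elements (for example through a data attribute) is left as an assumption.

```javascript
// Hypothetical helper: lists every word with its start and end character
// offsets in the original text.
function wordOffsets(text) {
  return Array.from(text.matchAll(/\S+/g), (m) => ({
    word: m[0],
    start: m.index,
    end: m.index + m[0].length, // exclusive
  }));
}

const clickedText = 'bar foo bar foofoo';
const offsets = wordOffsets(clickedText);
// The second "bar" is word 2; slicing from its end keeps the whitespace.
console.log(offsets[2]);
console.log(JSON.stringify(clickedText.slice(offsets[2].end)));
```

With the word index known from the click, offsets[index].start or offsets[index].end gives the position in the original text directly, without comparing word strings.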
0

Something like this works, also handles repeats:

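var text = "hello  world hello           world lorem ipsum";
var words = text.split(/\s+/);
for (var i = 0; i < words.length; i++) {
  // Match the first i words (with any whitespace between them) and capture the rest.
  // Note: words containing regex metacharacters would need escaping first.
  var prefix = words.slice(0, i).join("\\s+");
  console.log(i, "*" + text.match(prefix + "(.*)")[1] + "*");
}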

Personally I would not force regexps on this task though.
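
For comparison, here is a rough sketch of how the same lookup could be done without regular expressions, by scanning the characters once. The function name endOfWord is made up for this illustration, and the whitespace test relies on trim(), which strips the same characters that \s matches.

```javascript
// Hypothetical helper: character offset just past the word at `wordIndex`,
// found by scanning characters instead of building a regular expression.
function endOfWord(text, wordIndex) {
  const isSpace = (ch) => ch.trim() === ''; // true for whitespace characters
  let seen = -1; // index of the word currently being scanned
  let inWord = false;
  for (let i = 0; i < text.length; i++) {
    if (isSpace(text[i])) {
      if (inWord && seen === wordIndex) return i; // requested word just ended
      inWord = false;
    } else if (!inWord) {
      inWord = true;
      seen++;
    }
  }
  // Text ended inside the requested word, or the word does not exist.
  return inWord && seen === wordIndex ? text.length : -1;
}

const sampleText = 'hello  world hello           world lorem ipsum';
console.log(JSON.stringify(sampleText.slice(endOfWord(sampleText, 2))));
```

The function returns -1 for an out-of-range index, so the result should be checked before it is passed to slice.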

tevemadar
0

Well, let's assume that the parent element holding all the words has the id parent. Then you can do something like:

const parent = document.querySelector("#parent");

// Define the handler before registering it: a const arrow function
// cannot be referenced before its declaration.
const handleClick = (e) => {
    // e.target is the element that triggered the event;
    // e.currentTarget is the element the listener is attached to.
    if (e.target !== e.currentTarget) {
        // childNodes also counts text nodes; use parent.children to count only elements.
        const i = Array.prototype.indexOf.call(parent.childNodes, e.target);
        console.log(i);
    }
    e.stopPropagation();
};

parent.addEventListener("click", handleClick, false);

This should tell you which word was clicked. If you then want to find out which character was clicked within that word, you can follow the answer in "Determine the position index of a character within an HTML element when clicked".

ManavM