I have some text.

text = `Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do 
       eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim 
       ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut 
       aliquip ex ea commodo consequat.`

How could I split this up based on the length of another array?

array = 'sed do'

I tried:

alength = array.split(" ").length;
array2 = candidate.match('\/((?:(?:\\S+\\s){'+alength+'})|(?:.+)(?=\\n|$))\/g');

Which returns null.

What I was hoping to get was:

array2 = ['Lorem ipsum', 'dolor sit', ..., 'commodo consequat']

Is there another String method I could use instead?

rlu7732
    *"How could I split this up based on the length of another array."* - You mean, based on the number of words in another string? (Your other `array` variable is actually just a string, not an array, which I guess you know given you are treating it as a string despite its name and your description...) – nnnnnn Mar 20 '17 at 02:40

1 Answer

I like using regular expressions, but things can start to get confusing once you start building a regex dynamically, so I'd consider an alternative approach. E.g., you could just split the original string into individual words, then group them up as needed. Easy to understand and maintain:

function getPhrases(text, wordsPerPhrase) {
  var words = text.split(/\s+/)
  var result = []
  for (var i = 0; i < words.length; i += wordsPerPhrase) {
    result.push(words.slice(i, i + wordsPerPhrase).join(" "))
  }
  return result
}

text = `Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do 
       eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim 
       ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut 
       aliquip ex ea commodo consequat.`
       
console.log(getPhrases(text, 9))
console.log(getPhrases(text, 5))
console.log(getPhrases(text, 2))
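
To tie this back to the question, the group size can be taken from the number of words in the other string. The following is only a sketch: it repeats `getPhrases` so that it runs on its own, and `sample` and `reference` are hypothetical names (`reference` stands in for the question's `array` string).

```javascript
// Same helper as above, repeated so this snippet is self-contained.
function getPhrases(text, wordsPerPhrase) {
  var words = text.split(/\s+/)
  var result = []
  for (var i = 0; i < words.length; i += wordsPerPhrase) {
    result.push(words.slice(i, i + wordsPerPhrase).join(" "))
  }
  return result
}

// Hypothetical inputs: a shortened text and the string whose word count we want.
var sample = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit'
var reference = 'sed do'

// Count the words in the reference string and use that as the group size.
var wordsPerPhrase = reference.split(/\s+/).length
var phrases = getPhrases(sample, wordsPerPhrase)

console.log(phrases)
// [ 'Lorem ipsum', 'dolor sit', 'amet, consectetur', 'adipiscing elit' ]
```

If the text does not divide evenly, the last element simply holds the leftover words.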

Note: if you don't want punctuation in your output, you can add something like this as the first line of your function:

text = text.replace(/[^a-z'\s]/gi,'')
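
With that line in place, the function might look like the sketch below. `getCleanPhrases` is a hypothetical name, used only to keep it apart from the original. Note that this character class keeps ASCII letters, apostrophes and whitespace, so digits and accented letters are removed along with the punctuation.

```javascript
function getCleanPhrases(text, wordsPerPhrase) {
  // Drop everything except ASCII letters, apostrophes and whitespace.
  text = text.replace(/[^a-z'\s]/gi, '')
  var words = text.split(/\s+/)
  var result = []
  for (var i = 0; i < words.length; i += wordsPerPhrase) {
    result.push(words.slice(i, i + wordsPerPhrase).join(" "))
  }
  return result
}

var cleaned = getCleanPhrases("Ut enim ad minim veniam, quis nostrud.", 3)
console.log(cleaned)
// [ 'Ut enim ad', 'minim veniam quis', 'nostrud' ]
```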
nnnnnn
  • One thing to note is that a regex will typically be slower for doing simple split operations like this. Additionally, if you're going to use the expression more than once, you should precompile it. – Soviut Mar 20 '17 at 02:56
  • @Soviut - Well, in the input shown there appeared to be both spaces and linebreaks, hence `/\s+/` rather than just a non-regex `" "`. The expression is only used more than once in the sense of being used once each time the function is called; I don't think it takes long to compile `/\s+/`... – nnnnnn Mar 20 '17 at 03:00
  • @torazaburo - Isn't that exactly the approach I took in this answer? – nnnnnn Mar 20 '17 at 03:21