Note
There's an update below that I think shows a better version of this same idea. But this is where it started.
Original Version
Here's another attempt, building a more flexible solution out of reusable parts.
const intoPairs = (xs) =>
xs .slice (1) .map ((x, i) => [xs[i], x])
const splitAtIndices = (indices, str) =>
intoPairs (indices) .map (([a, b]) => str .slice (a, b))
const alternate = Object.assign((f, g) => (xs, {START, MIDDLE, END} = alternate) =>
xs .map (
(x, i, a, pos = i == 0 ? START : i == a.length - 1 ? END : MIDDLE) =>
i % 2 == 0 ? f (x, pos) : g (x, pos)
),
{START: {}, MIDDLE: {}, END: {}}
)
const wrap = (before, after) => (s) => `${before}${s}${after}`
const truncate = (count) => (s, pos) =>
pos == alternate.START
? s .length <= count ? s : '... ' + s .slice (-count)
: pos == alternate.END
? s .length <= count ? s : s .slice (0, count) + ' ...'
: // alternate.MIDDLE
s .length <= (2 * count) ? s : s .slice (0, count) + ' ... ' + s .slice (-count)
const highlighter = (f, g) => (ranges, str, flip = ranges[0][0] == 0) =>
alternate (flip ? g : f, flip ? f : g) (
splitAtIndices ([...(flip ? [] : [0]), ...ranges .flat() .sort((a, b) => a - b), str.length], str)
) .join ('')
const highlight = highlighter (truncate (20), wrap('<u>', '</u>'))
#output {padding: 0 1em;} #input {padding: .5em 1em 0;} textarea {width: 50%; height: 3em;} button, input {vertical-align: top; margin-left: 1em;}
<div id="input"> <textarea id="string">Some text that doesn't include the main thing, the main thing is the result, you may know I meant that</textarea> <input type="text" id="indices" value="[23, 30], [69, 75]"/> <button id="run">Highlight</button></div><h4>Output</h4><div id="output"></div> <script>document.getElementById('run').onclick = (evt) => { const str = document.getElementById('string').value; const idxString = document.getElementById('indices').value; const idxs = JSON.parse(`[${idxString}]`); const result = highlight(idxs, str); console.clear(); document.getElementById('output').innerHTML = ''; setTimeout(() => { console.log(result); document.getElementById('output').innerHTML = result; }, 300)}</script>
This involves the helper functions intoPairs
, splitAtIndices
alternate
, wrap
and truncate
. I think they are best show by examples:
intoPairs (['a', 'b', 'c', 'd']) //=> [['a', 'b'], ['b', 'c'], ['c', 'd']]
splitAtIndices ([0, 3, 7, 15], 'abcdefghijklmno') //=> ["abc", "defg", "hijklmno"]
// ^ ^ ^ ^ `---' `----' `--------'
// | | | | | | |
// 0 3 7 15 0 - 3 4 - 7 8 - 15
alternate (f, g) ([a, b, c, d, e, ...]) //=> [f(a), g(b), f(c), g(d), f(e), ...]
wrap ('<div>', '</div>') ('foo bar baz') //=> '<div>foo bar baz</div>
//chars---+ input---+ position---+ output--+
// | | | |
// V V V V
truncate (10) ('abcdefghijklmnop', ~START~) //=> '... ghijklmnop'
truncate (10) ('abcdefghijklmnop', ~END~) //=> 'abcdefghij ...'
truncate (10) ('abcdefghijklmnop', ~MIDDLE~) //=> 'abcdefghijklmnop'
truncate (10) ('abcdefghijklmnopqrstuvwxyz', ~MIDDLE~) //=> 'abcdefghij ... qrstuvwxyz'
All of these are potentially reusable, and I personally have intoPairs
and wrap
in my general utility library.
truncate
is the only complex one, and that is mostly because it does triple duty, handling the first string, the last string, and all the others in three distinct manners. You first supply a count
and the you give a string as well as the position (START
, MIDDLE
, END
, stored as properties of alternate
.) For the first string, it includes an ellipsis (...
) and the last count
characters. For the last one, it includes the first count
characters and an ellipsis. For the middle ones, if the length is shorter than double count
, it returns the whole thing; otherwise it includes the first count
characters, an ellipsis and the last count
characters. This behavior might be different from what you want; if so,
The main function is highlighter
. It accepts two functions. The first one is how you want to handle the non-highlighted sections. The second is for the highlighted ones. It returns the style function you were looking for, one that accepts an array of two-element arrays of numbers (the ranges) and your input string, returning a string with the highlighted ranges and the non-highlighted ranges.
We use it to generate the highlight
function by passing it truncate (20)
and wrap('<u>', '</u>')
.
The intermediate forms might make it clearer what's going on.
We start with these indices:
[[23, 30], [69, 75]]]
and our 103-character string,
"Some text that doesn't include the main thing, the main thing is the result, you may know I meant that"
First we flatten the ranges, prepending a zero if the first range doesn't start there and appending the last index of the string, to get this:
[0, 23, 30, 69, 75, 102]
We pass that to splitAtIndices, along with our string, to get
[
"Some text that doesn't ",
"include",
" the main thing, the main thing is the ",
"result",
", you may know I meant that"
]
Then we map the appropriate functions over each of these strings to get
[
"... e text that doesn't ",
"<u>include</u>",
" the main thing, the main thing is the ",
"<u>result</u>",
", you may know I mea ..."
]
and join those together to get our final results:
"... e text that doesn't <ul>include</ul> the main thing, the main thing is the <ul>result</ul>, you may know I mea ..."
I like the flexibility this offers. It's easy to alter the highlight strategy as well as how you handle the unhighlighted parts -- just pass a different function to highlighter
. It's also a useful breakdown of the work into reusable parts.
But there are two things I don't like.
First, I'm not thrilled with the handling of middle unhighlighted sections. Of course it's easy to change; but I don't know what would be appropriate. You might, for instance, want to change the doubling applied to the count there. Or you might have an entirely different idea.
Second, truncate
is dependent upon alternate
. We have to somehow pass signals from alternate
to the two functions supplied to it to let them know where we are. My first pass involved passing the index and the entire array (the Array.prototype.map
signature) to those functions. But that felt too coupled. We could make START
, MIDDLE
, and END
into module-local properties, but then alternate
and truncate
would not be reusable. I'm not going to go back and try it now, but I think a better solution might be to pass four functions to highlighter
: the function for the highlighted sections, and one each for start, middle, and end positions of the non-highlighted ones.
Update
I did go ahead and try that alternative I mentioned, and I think this version is cleaner, with all the complexity located in the single function highlighter
:
const intoPairs = (xs) =>
xs .slice (1) .map ((x, i) => [xs[i], x])
const splitAtIndices = (indices, str) =>
intoPairs (indices) .map (([a, b]) => str .slice (a, b))
const wrap = (before, after) => (s) => `${before}${s}${after}`
const truncateStart = (count) => (s) =>
s .length <= count ? s : '... ' + s .slice (-count)
const truncateMiddle = (count) => (s) =>
s .length <= (2 * count) ? s : s .slice (0, count) + ' ... ' + s .slice (-count)
const truncateEnd = (count) => (s) =>
s .length <= count ? s : s .slice (0, count) + ' ...'
const highlighter = (highlight, start, middle, end) =>
(ranges, str, flip = ranges[0][0] == 0) =>
splitAtIndices ([...(flip ? [] : [0]), ...ranges .flat() .sort((a, b) => a - b), str.length], str)
.map (
(s, i, a) =>
(flip
? (i % 2 == 0 ? highlight : i == a.length - 1 ? end : middle)
: (i == 0 ? start : i % 2 == 1 ? highlight : i == a.length - 1 ? end : middle)
) (s)
) .join ('')
const highlight = highlighter (
wrap('<u>', '</u>'),
truncateStart(20),
truncateMiddle(20),
truncateEnd(20)
)
console .log (
highlight (
[[23, 30], [69, 75]],
"Some text that doesn't include the main thing, the main thing is the result, you may know I meant that"
)
)
console .log (
highlight (
[[23, 30], [86, 92]],
"Some text that doesn't include the main thing, because you see, the main thing is the result, you may know I meant that"
)
)
There is some real complexity built into highlighter
, but I think it's fairly intrinsic to the problem. On each iteration, we have to choose one of our four functions based on the index, the length of the array, and whether the first range started at zero. This bit here simply chooses the function based on all that:
(flip
? (i % 2 == 0 ? highlight : i == a.length - 1 ? end : middle)
: (i == 0 ? start : i % 2 == 1 ? highlight : i == a.length - 1 ? end : middle)
)
where the flip
boolean simply reports whether the first range starts at 0
, a
is the array of substrings to handle., and i
is the current index in the array. If you see a cleaner way of choosing the function, I'd love to know.
If we wanted to write a gloss for this sort of highlighting, we could easily write
const truncatingHighlighter = (count, start, end) =>
highlighter (
wrapp(start, end),
truncateStart(count),
truncateMiddle(count),
truncateEnd(count)
)
const highlight = truncatingHighlighter (20, '<u>', '</u>')
I definitely think this is a superior solution.