
I'm using a virtualized list (react-virtualized), which requires the height of every list item. The heights vary greatly, so any height estimate I give the library yields a poor experience.

The usual method for height calculation goes something like this:

const containerStyle = {
  display: "inline-block",
  position: "absolute",
  visibility: "hidden",
  zIndex: -1,
};

export const measureText = (text) => {
  const container = document.createElement("div");
  // Copy the style properties onto the element.
  Object.assign(container.style, containerStyle);

  // `text` is a string, so set it as the element's text content.
  container.textContent = text;

  document.body.appendChild(container);

  // Reading the size forces a synchronous layout.
  const height = container.clientHeight;
  const width = container.clientWidth;

  document.body.removeChild(container);
  return { height, width };
};

Unfortunately, when you're dealing with extremely large lists with items of varying sizes, this isn't performant. While a cache may be leveraged, even that doesn't work out so well when you need to know the total height (height of all items combined) at the very beginning.

A second common solution is the HTML canvas measureText method. Its performance is similar to the DOM manipulation above.

In my case, I know the following:

  • Container Width
  • Font
  • Font size
  • All padding
  • All margins
  • Any and all other styling like line-height

What I'm looking for is a mathematical solution that can compute the height (or an extremely close estimate) such that I don't have to rely on any DOM manipulation and I can get the height whenever I please.

I imagine it goes something like this:

const measureText = (text, options) => {
  const { width, font, fontSize, padding, margins, borders, lineHeight } = options;

  // Assume this magical function exists.
  // It depends on the width, styling, and font information.
  const numberOfLines = calculateLines(text, options);

  const contentHeight = numberOfLines * lineHeight;

  // Pseudo-code: somehow get the border thickness in pixels.
  const borderHeight = borders.width * 2;

  const marginsHeight = margins.top + margins.bottom;
  const paddingHeight = padding.top + padding.bottom;

  return marginsHeight + paddingHeight + borderHeight + contentHeight;
}

In the above, we're missing the calculateLines function, which seems like the bulk of the work. How would one move forward on that front? Would I need to do some pre-processing to figure out character widths? Since I know the font I'm using, this shouldn't be too big an issue, right?
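One possible shape for calculateLines is a greedy word wrap over pre-measured word widths. The following is only a sketch under assumptions: `calculateLinesSketch` and the `measureWord` callback are hypothetical names, `measureWord` is assumed to return a string's width in pixels (from a width table or any other source), whitespace is collapsed, and explicit newlines, hyphenation, and break opportunities other than spaces are ignored.

```typescript
// Hypothetical callback: returns the rendered width of a string in pixels.
type MeasureWord = (word: string) => number;

// Greedy word wrap: keep a word on the current line if it fits,
// otherwise start a new line. Assumes collapsed whitespace and no "\n".
function calculateLinesSketch(
  text: string,
  maxWidth: number,
  measureWord: MeasureWord
): number {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  // Assumption: an empty item still occupies one line.
  if (words.length === 0) return 1;

  const spaceWidth = measureWord(" ");
  let lines = 1;
  let lineWidth = 0; // width of the words on the current line, no trailing space

  for (const word of words) {
    const wordWidth = measureWord(word);
    // Width of the line if this word were appended to it.
    const candidate =
      lineWidth === 0 ? wordWidth : lineWidth + spaceWidth + wordWidth;

    if (candidate > maxWidth && lineWidth !== 0) {
      lines += 1; // the word does not fit, so it starts a new line
      lineWidth = wordWidth;
    } else {
      // The word fits, or it is a lone overlong word that simply overflows.
      lineWidth = candidate;
    }
  }
  return lines;
}
```

With a fake measurer such as `(s) => s.length * 10`, the text "aaa bbb ccc" wraps onto two lines at a maximum width of 70. How close this gets to a real browser depends entirely on how accurate `measureWord` is.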

Are there browser-specific concerns? How might the calculation vary between browsers?

Are there any other parameters to consider? For example, if the user has some system setting that enlarges text for them (accessibility), does the browser tell me this through any usable data?

I understand that rendering to the DOM is the simplest approach, but I'm willing to put the effort into a formulaic solution, even if that means I have to update the inputs to the function every time I change margins and the like.

Update: This may help on the path towards finding character widths: "Static character width map calibrated via SVG bounding box". The following has more information: "Demo and details". Credit goes to Toph.

Update 2: With monospaced typefaces, the width calculation becomes even simpler, since you only need to measure the width of one character. Surprisingly, some very nice and popular fonts, like Menlo and Monaco, are monospaced.
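For the monospaced case, the arithmetic can be sketched as below. The function names are hypothetical, and `charWidth` is assumed to have been measured once for the chosen font and font size. Characters that fall back to another font (emoji, CJK, and so on) may not share that width, so this is only an approximation for such text.

```typescript
// Width of a string in a monospaced font: every character advances by the
// same amount. Note that `length` counts UTF-16 code units, not glyphs.
function monospaceWidth(text: string, charWidth: number): number {
  return text.length * charWidth;
}

// Largest number of characters that fit on one line of the given width.
function charsPerLine(maxWidth: number, charWidth: number): number {
  return Math.floor(maxWidth / charWidth);
}
```

For example, with a character width of 8 pixels, "hello" is 40 pixels wide and a 100 pixel line holds 12 characters.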

Big Update 3: It was quite an all-nighter, but through inspiration via the SVG method in update 1, I came up with something that has been working fantastically to calculate the number of lines. Unfortunately, I've seen that 1% of the time it is off by 1 line. The following is roughly the code:

const wordWidths = {} as { [word: string]: number };

const xmlsn = "http://www.w3.org/2000/svg";

const svg = document.createElementNS(xmlsn, "svg");
const text = document.createElementNS(xmlsn, "text");
const spaceText = document.createElementNS(xmlsn, "text");
svg.appendChild(text);
svg.appendChild(spaceText);

document.body.appendChild(svg);

// Convert style objects like { backgroundColor: "red" } to "background-color: red;" strings for HTML
const styleString = (object: any) => {
  return Object.keys(object).reduce((prev, curr) => {
    // camelCase -> kebab-case, e.g. "backgroundColor" -> "background-color"
    const property = curr
      .split(/(?=[A-Z])/)
      .join("-")
      .toLowerCase();
    return `${prev}${property}:${object[curr]};`;
  }, "");
};

const getWordWidth = (character: string, style: any) => {
  // The cache is keyed by text only, so it assumes one style for all calls.
  const cachedWidth = wordWidths[character];
  if (cachedWidth !== undefined) return cachedWidth;

  let width;

  // edge case: a naked space (charCode 32) takes up no space, so we need
  // to handle it differently. Wrap it between two letters, then subtract those
  // two letters from the total width.
  if (character === " ") {
    const textNode = document.createTextNode("t t");
    spaceText.appendChild(textNode);
    spaceText.setAttribute("style", styleString(style));
    width = spaceText.getBoundingClientRect().width;
    width -= 2 * getWordWidth("t", style);
    wordWidths[" "] = width;
    spaceText.removeChild(textNode);
  } else {
    const textNode = document.createTextNode(character);
    text.appendChild(textNode);
    text.setAttribute("style", styleString(style));
    width = text.getBoundingClientRect().width;
    wordWidths[character] = width;
    text.removeChild(textNode);
  }

  return width;
};

const getNumberOfLines = (text: string, maxWidth: number, style: any) => {
  let numberOfLines = 1;

  // In my use-case, I trim all white-space and don't allow multiple spaces in a row
  // It also simplifies this logic. Though, for now this logic does not handle
  // new-lines
  const words = text.replace(/\s+/g, " ").trim().split(" ");
  const spaceWidth = getWordWidth(" ", style);

  let lineWidth = 0;
  const wordsLength = words.length;

  for (let i = 0; i < wordsLength; i++) {
    const wordWidth = getWordWidth(words[i], style);

    if (lineWidth + wordWidth > maxWidth) {
      /**
       * If the line has no other words (lineWidth === 0),
       * then this word will overflow the line indefinitely.
       * Browsers will not push the text to the next line. This is intuitive.
       *
       * Hence, we only move to the next line if this line already has
       * a word (lineWidth !== 0)
       */
      if (lineWidth !== 0) {
        numberOfLines += 1;
      }

      lineWidth = wordWidth + spaceWidth;
      continue;
    }

    lineWidth += wordWidth + spaceWidth;
  }

  return numberOfLines;
};

Originally, I did this character by character, but because kerning affects groups of letters, going word by word is more accurate. It's also important to note that although style is applied, the padding must be accounted for in the maxWidth parameter, since CSS padding has no effect on the SVG text element. The method handles the width-adjusting letter-spacing style decently (it's not perfect, and I'm not sure why).
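Since getNumberOfLines above does not handle new-lines, one way to add that is to split on line breaks first and sum the per-paragraph counts. This is a sketch, not part of the code above: `countLinesWithNewlines` and the `countWrapped` callback are hypothetical names, and it assumes the item is rendered with a CSS white-space value that preserves line breaks (such as pre-line or pre-wrap). With the default white-space, new-lines collapse into spaces and this step is unnecessary.

```typescript
// Hypothetical stand-in for a wrapped-line counter such as getNumberOfLines:
// returns the number of visual lines for a paragraph that contains no "\n".
type CountWrappedLines = (paragraph: string) => number;

// Split on explicit line breaks and sum the wrapped lines of each paragraph.
// Assumption: a blank paragraph (including one produced by a trailing
// newline) counts as one line, which may over-count in some browsers.
function countLinesWithNewlines(
  text: string,
  countWrapped: CountWrappedLines
): number {
  return text
    .split(/\r\n|\r|\n/)
    .map((paragraph) => (paragraph.trim() === "" ? 1 : countWrapped(paragraph)))
    .reduce((total, n) => total + n, 0);
}
```

In practice the callback would be something like `(p) => getNumberOfLines(p, maxWidth, style)`.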

As for internationalization, it seemed to work just as well as it did with English, except when I got into Chinese. I don't know Chinese, but it seems to follow different rules for overflowing onto new lines, and this doesn't account for those rules.
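A likely reason is that Chinese text has no spaces between words and browsers may break a line between ideographs, so splitting on spaces yields very long "words". A rough sketch of a different tokenization follows. It is a simplification under assumptions: the function names are hypothetical, only the CJK Unified Ideographs block (U+4E00 to U+9FFF) is recognized, and rules about punctuation that may not start or end a line are ignored.

```typescript
// True for code units in the CJK Unified Ideographs block (U+4E00..U+9FFF).
// Assumption: this one block is enough for illustration. Real text also uses
// other ranges (extensions, kana, hangul, full-width punctuation).
function isCjkIdeograph(codeUnit: number): boolean {
  return codeUnit >= 0x4e00 && codeUnit <= 0x9fff;
}

// Split text into units between which a line break is allowed:
// whitespace-separated words, with each CJK ideograph as its own unit.
function splitBreakUnits(text: string): string[] {
  const units: string[] = [];
  let current = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (/\s/.test(ch)) {
      // Whitespace ends the current word.
      if (current !== "") units.push(current);
      current = "";
    } else if (isCjkIdeograph(text.charCodeAt(i))) {
      // An ideograph ends the current word and stands alone.
      if (current !== "") units.push(current);
      units.push(ch);
      current = "";
    } else {
      current += ch;
    }
  }
  if (current !== "") units.push(current);
  return units;
}
```

A line-counting loop built on these units would also need to skip the space width between adjacent ideographs, since no space is rendered there.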

Unfortunately, like I said earlier, I have noticed that this is off-by-one now and then. Though this is uncommon, it is not ideal. I'm trying to figure out what is causing the tiny discrepancies.

The test data I'm working with is randomly generated, with each item anywhere from 4 to 80 lines long (and I generate 100 items at a time).

Update 4: I don't think I'm getting any incorrect results anymore. The change is subtle but important: instead of getNumberOfLines(text, width, styles), you need to use getNumberOfLines(text, Math.floor(width), styles) and make sure Math.floor(width) is the width being used in the DOM as well. Browsers are inconsistent and handle fractional pixels differently. If we force the width to be an integer, we don't have to worry about it.
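Putting the pieces together, the height formula from the earlier pseudo-code could be sketched as follows. The type and function names are hypothetical, and the sketch assumes that `containerWidth` is the item's outer width (as with box-sizing: border-box), that all values are in pixels, and that vertical margins do not collapse with neighbouring items.

```typescript
interface BoxSides {
  top: number;
  right: number;
  bottom: number;
  left: number;
}

interface ItemBox {
  containerWidth: number; // outer width of the item, in pixels
  padding: BoxSides;
  border: BoxSides;
  margin: BoxSides;
  lineHeight: number; // line height in pixels
}

// Width available to the text, floored to a whole pixel as in Update 4.
function contentWidth(box: ItemBox): number {
  const horizontal =
    box.padding.left + box.padding.right + box.border.left + box.border.right;
  return Math.floor(box.containerWidth - horizontal);
}

// Total vertical space of an item whose text wraps onto `numberOfLines` lines.
function itemHeight(numberOfLines: number, box: ItemBox): number {
  const vertical =
    box.padding.top + box.padding.bottom +
    box.border.top + box.border.bottom +
    box.margin.top + box.margin.bottom;
  return numberOfLines * box.lineHeight + vertical;
}
```

The value returned by `contentWidth` would be passed as maxWidth to getNumberOfLines, and the same integer width would have to be applied to the rendered element.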

David
  • I don't think I've ever seen a decent implementation that doesn't use a hidden DOM element. Even those are usually still "good best guess" and not perfect. Do share if anyone does find one though. – user120242 Jun 16 '20 at 04:41
  • @user120242 me either. I'm currently fiddling around with my own width calculator. Will report results. – David Jun 16 '20 at 04:54
  • @user120242 I edited with an update. While technically it is on the DOM, I must say... the SVG method is extremely performant. Don't even notice a blip and I'm dealing with a large data set. – David Jun 16 '20 at 16:04
  • What about Z̷̧̢̩̫̟͖̟͇͙̫̟͚̦̓͌̍̐̌̊̓ä̴̭̼̹̫͎͕̲͙͈͊̌̈̕̕͜ͅl̸̻̹̦̬͕͍͉͗̓̌͐̄̃̎͂̈̄̚͘͝͠g̴̹̽̆͌͋͗̏̌̀͆̆̕ŏ̸̱͎͕̥̹͔̱̺̗̽̅̂̀̆͐̀̚͜ͅ? – Kaiido Jun 17 '20 at 06:22
  • @Kaiido I don't think anything would handle that overflow well - testing on Chrome, it doesn't acknowledge that text or accommodate its height in any way. – David Jun 17 '20 at 09:30
  • The canvas method and measureText. Doesn't really apply to this thread though. – user120242 Jun 19 '20 at 20:06
  • So in the end (update 4) it seems you were able to get your result, weren't you? – Daniele Ricci Jun 19 '20 at 22:42
  • @DanieleRicci potentially, yes. I'm not confident it works with `letter-spacing` CSS and I need to do more extensive testing. Also... I need to figure out why some languages like Chinese aren't working well. – David Jun 20 '20 at 03:19

2 Answers


IMHO the core of this question is in these few words:

Unfortunately, when you're dealing with extremely large lists with items of varying sizes, this isn't performant. While a cache may be leveraged, even that doesn't work out so well when you need to know the total height (height of all items combined) at the very beginning.

This runs against JavaScript's nature and philosophy: "extremely large lists" and "at the very beginning" are two requirements that don't combine well in JavaScript.

Probably you can achieve better results with less effort if you focus on what makes you say "at the very beginning" rather than seeking the actual answer to this question. Regardless of how performant the solution you can find is, when the "extremely large lists" continue to grow, your solution will unavoidably cause a UI block.
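One way to relax the "at the very beginning" requirement, offered only as a sketch and not as something from the question, is to show an estimated total height first and refine it as items are actually measured. The `createHeightEstimator` name and its methods are hypothetical. Unmeasured items are assumed to have the average height of the measured ones, or a default height before anything is measured.

```typescript
// Tracks measured item heights and estimates the rest from their average.
function createHeightEstimator(itemCount: number, defaultHeight: number) {
  const measured: { [index: number]: number | undefined } = {};
  let measuredCount = 0;
  let measuredSum = 0;

  return {
    // Record (or update) the real height of one item.
    setHeight(index: number, height: number): void {
      const previous = measured[index];
      if (previous === undefined) {
        measuredCount += 1;
      } else {
        measuredSum -= previous;
      }
      measured[index] = height;
      measuredSum += height;
    },

    // Measured heights plus (average measured height * unmeasured count).
    totalHeight(): number {
      const average =
        measuredCount > 0 ? measuredSum / measuredCount : defaultHeight;
      return measuredSum + average * (itemCount - measuredCount);
    },
  };
}
```

The estimate changes as more items are measured, so the scrollbar may shift slightly. That is the trade-off for not measuring everything up front.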

This is only my two cents.

greg-tumolo
Daniele Ricci

I found the "Measure text" algorithm, which approximates the width of strings without touching the DOM.

I modified it a little to calculate the number of lines (the part where you are stuck).

You can calculate the number of lines as shown below:

/**
 * @param {string} text - The text to be rendered.
 * @param {number} containerWidth - Width of the container in which the text is rendered.
 * @param {number} fontSize - Font size of the rendered text.
 */

function calculateLines(text, containerWidth, fontSize = 14) {
  let lines = 1;  // Initiating number of lines with 1

  // Widths and average value are based on the `Helvetica` font.
  const widths = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.278125,0.278125,0.35625,0.55625,0.55625,0.890625,0.6671875,0.1921875,0.334375,0.334375,0.390625,0.584375,0.278125,0.334375,0.278125,0.303125,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.278125,0.278125,0.5859375,0.584375,0.5859375,0.55625,1.015625,0.6671875,0.6671875,0.7234375,0.7234375,0.6671875,0.6109375,0.778125,0.7234375,0.278125,0.5,0.6671875,0.55625,0.834375,0.7234375,0.778125,0.6671875,0.778125,0.7234375,0.6671875,0.6109375,0.7234375,0.6671875,0.9453125,0.6671875,0.6671875,0.6109375,0.278125,0.35625,0.278125,0.478125,0.55625,0.334375,0.55625,0.55625,0.5,0.55625,0.55625,0.278125,0.55625,0.55625,0.2234375,0.2421875,0.5,0.2234375,0.834375,0.55625,0.55625,0.55625,0.55625,0.334375,0.5,0.278125,0.55625,0.5,0.7234375,0.5,0.5,0.5,0.35625,0.2609375,0.3546875,0.590625]
  const avg = 0.5293256578947368;

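  text.split('')
    .map(c => c.charCodeAt(0) < widths.length ? widths[c.charCodeAt(0)] : avg)
    .reduce((lineWidth, charWidth) => {
      // Wrap when adding this character would exceed the container width.
      if ((lineWidth + charWidth) * fontSize > containerWidth) {
        lines++;
        return charWidth; // the character starts the new line
      }
      return lineWidth + charWidth;
    }, 0);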

  return lines;
}

Note

I used Helvetica as the font-family. You can get the values of widths and avg for your own font-family from "Measure text".
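The same width table can also be used to approximate the width of a whole string, for example as the word measurer in a word-by-word line counter. This is a sketch with a hypothetical function name. It assumes `widths[code]` holds the character's width at a font size of 1, as in the table above, and it ignores kerning.

```typescript
// Approximate rendered width of `text` from a per-character width table.
// `widths[code]` is the width of the character with that char code at a
// font size of 1; `avg` is used for characters outside the table.
function approximateWidth(
  text: string,
  widths: number[],
  avg: number,
  fontSize: number
): number {
  let total = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    total += code < widths.length ? widths[code] : avg;
  }
  return total * fontSize;
}
```

Called as `approximateWidth(word, widths, avg, 14)`, it returns an estimate in the same unit as the font size.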

harish kumar
  • If you check out my "Update" and "Big Update 3", you'll notice that I actually link to the algorithm you mention. I also got some work started on a getNumberOfLines function. I'm not sure if my function works in all cases (but lately I've seen it working consistently for English). More testing has to be done with different style properties passed in. Further, my method doesn't handle Chinese well at all (though I don't know the Chinese rules on overflowing). Also, this is a hail mary, but my solution doesn't handle text where you might want some words bolded (I.e. different styles). – David Jun 19 '20 at 20:09
  • As for your algorithm, you should check mine out. Yours has a bug where any overflow of the width will constitute a new line, but this isn't actually true. Sometimes, text will overflow, but the browser won't push it onto a new line. – David Jun 19 '20 at 20:10
  • I understand your concern. Although this algorithm is not perfect, it can give you a value close to the correct one. I'll try to figure out something else. – harish kumar Jun 19 '20 at 20:13
  • What styling information did you provide? I'll check it out later. – David Jun 19 '20 at 20:42
  • I found your algorithm working far better than the one above. – harish kumar Jun 19 '20 at 20:48