Using JavaScript, I'd like to break a string of arbitrary length into segments of at most 80 characters, with the caveat that I don't want to break words. I am currently using the method from "Split large string in n-size chunks in JavaScript":

var dialog_array = dialog_to_load.match(/.{1,80}/g);

The issue is that a word that begins at the 76th character and ends at the 84th character will be split in two. Is there a sleek bit of regex or code to prevent this?
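
To illustrate the problem on a smaller scale, here is a minimal sketch that uses a width of 10 instead of 80. The sample string and the variable names are made up for illustration only:

```javascript
// Illustrative sample text, not taken from the real dialog data.
const text = "split these words evenly";

// Same approach as above, with a width of 10 so the effect is easy to see.
const chunks = text.match(/.{1,10}/g);

console.log(chunks);
// → [ 'split thes', 'e words ev', 'enly' ]
```

Both "these" and "evenly" are cut in the middle, which is the behavior I want to avoid.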

To clarify, I am capable of writing a small function to achieve this; I'm just wondering if there's a cleaner, sleeker way.

The string will be of arbitrary length and content, but here's an example, as requested:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Bryant Makes Programs

2 Answers

How about using, e.g., /.{1,80}\b/g to respect word boundaries?
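
As a rough sketch of how this pattern behaves, the example below uses a width of 10 rather than 80. The sample string is invented for illustration; note that `\b` treats an apostrophe as a boundary, so a contraction can still be split:

```javascript
// Illustrative input only; the width of 10 keeps the output short.
const sample = "word, that's it";

// \b matches between a word character and a non-word character,
// which includes the position between "that" and the apostrophe.
const byBoundary = sample.match(/.{1,10}\b/g);

console.log(byBoundary);
// → [ 'word, that', "'s it" ]
```

Plain words are kept whole, but punctuation inside or next to a word still counts as a valid break point.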

Janne
  • This is a very nice idea! Just in the rare event that there would not be a word break before reaching the 80 character limit, characters at the beginning of the line would be lost. But with 80 characters that is quite theoretical. – trincot Feb 18 '17 at 16:26
  • Also, a break could occur just before punctuation: `word, word` could become `['word', ', word']`. – trincot Feb 18 '17 at 16:40
  • This is almost exactly what I need! It breaks words like `that's`, and it would theoretically break `word.`. Is there a way to preserve punctuation like that? Best answer, regardless, but would be nice if that were manageable :) – Bryant Makes Programs Feb 18 '17 at 19:05

You could use this regular expression:

/\S.{1,79}(?=$|\s)/g

The \S ensures that a line starts with a non-space character. As a consequence, the count in .{1,79} needs to be one less than the width. The look-ahead (?=$|\s) ensures that the match stops only where white-space follows, or at the end of the string ($).

When used with match() you get the lines as requested, with spaces removed at the position where a line break occurs.
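
If the width needs to vary, the same idea can be wrapped in a small function. This is only a sketch: `wrapText` and its parameters are hypothetical names, and it deviates from the pattern above in one way, using `{0,n}` instead of `{1,n}` so that a chunk consisting of a single character is not dropped.

```javascript
// Hypothetical helper, not part of the original pattern's usage.
function wrapText(text, width) {
  // \S               start each chunk on a non-space character
  // .{0,width-1}     then take up to width-1 more characters (greedy)
  // (?=$|\s)         and stop only where white-space or the end follows
  // Assumption: {0,...} is used here instead of {1,...} so that a
  // one-character chunk such as a trailing "c" is still returned.
  const pattern = new RegExp("\\S.{0," + (width - 1) + "}(?=$|\\s)", "g");

  // match() returns null when nothing matches; normalize that to [].
  return text.match(pattern) || [];
}

console.log(wrapText("the quick brown fox jumps over the lazy dog", 10));
// → [ 'the quick', 'brown fox', 'jumps over', 'the lazy', 'dog' ]
```

One limitation to keep in mind: a single word longer than `width` cannot satisfy the look-ahead in full, so its leading characters are silently dropped. The `.` also does not match line breaks unless the `s` flag is added.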

The snippet below uses 50 as width instead of 80, so it renders well:

var s = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
var res = s.match(/\S.{1,49}(?=$|\s)/g);
console.log(res);
trincot
  • Did you check my answer? I saw your comment on the other answer about splitting `that's`. The solution I propose does not have this problem. Did you find any other issue with it? – trincot Feb 18 '17 at 19:18
  • Why the down vote? I would like to know what is wrong so I can improve... – trincot Feb 18 '17 at 19:44