
I'm looking for an effective way to strip whitespace from a multiline text string.

It should

  1. Replace all \r with \n.
  2. Remove any leading and trailing whitespace in each line.
  3. Remove empty and whitespace-only lines.
  4. Replace every whitespace sequence within a line with a single space.
  5. Respect all Unicode whitespace characters.

So, for a given string

var string = ' \n\t \r  \r  \xA0\n <1>  \r     \n\r\r\n\n   <2> \t \t \r \t  \r \r <3>   \n  <a    a   a   a> \r \r \r \r\t  \n   \n  ';

It should return

"<1>\n<2>\n<3>\n<a a a a>"

So far I have come up with this:

string
    .replace(/[ \f\t\v\u00a0\u1680\u180e\u2000-\u200a\u2028\u2029\u202f\u205f\u3000]+/g, ' ')
    .replace(/ ?[\n\r][\n\r ]*/g, '\n')
    .replace(/^\n|\n$/g, '')
;
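
A note on the chain above: it appears to leave a leading or trailing space in place when the input has no line break at that end (for example, `'  a'` becomes `' a'`). The sketch below is one possible variation, not a definitive answer. It swaps the explicit character list for `[^\S\r\n]` (whatever the engine's `\s` treats as whitespace, minus `\r` and `\n`) and trims spaces as well as newlines at both ends. The name `stripLines` is hypothetical, and the exact set of code points covered by `\s` depends on the engine's Unicode version (`\u180e`, for instance, may not be matched by current engines).

```javascript
// Hypothetical helper; a sketch under the assumptions stated above.
function stripLines(text) {
    return text
        // Collapse each run of non-newline whitespace into one space.
        .replace(/[^\S\r\n]+/g, ' ')
        // Collapse each run of line breaks (plus the single spaces around
        // them) into one '\n'; this also turns '\r' into '\n'.
        .replace(/ ?[\r\n][\r\n ]*/g, '\n')
        // Trim whatever is left at either end: a newline or a lone space.
        .replace(/^[\n ]+|[\n ]+$/g, '');
}

var sample = ' \n\t \r  \r  \xA0\n <1>  \r     \n\r\r\n\n   <2> \t \t \r \t  \r \r <3>   \n  <a    a   a   a> \r \r \r \r\t  \n   \n  ';
console.log(JSON.stringify(stripLines(sample)));
// prints "<1>\n<2>\n<3>\n<a a a a>"
```

It is still three passes, so it is unlikely to be faster; the only intended differences are the shorter character class and the end-of-string trimming.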

Can you suggest a "better" way?

Please do not suggest `.split().map().join()` chains.

disfated
  • I would use `split('\n')` to split it up by line, iterate over the lines, and then `join` them back together. – Barmar May 21 '15 at 05:30
  • I would read the question to the end before answering... – disfated May 21 '15 at 05:35
  • gee, take a step back. @Barmar was giving you suggestions from his every-day experience, not an SO Answer! – Lorenz Lo Sauer May 21 '15 at 05:38
  • Sorry, I didn't see that last line. But I don't understand why you're against that, it seems like an arbitrary restriction. – Barmar May 21 '15 at 05:44
  • This regex approach is valid. It is working. What enhancement are you looking for? – Wiktor Stribiżew May 21 '15 at 07:26
  • Less code / more performant / less trivial... It is about the "art of coding". Doing trivial things in different manner. [Look](http://stackoverflow.com/questions/7624920/number-sign-in-javascript). – disfated May 21 '15 at 11:11

0 Answers