
I want to remove the whitespace between HTML tags. I've tried all the solutions on Stack Overflow, but none of them really works.


Input 1: ' <ul> <li>Lorem Delor </li> </ul> '

Expected Output 1: '<ul><li>Lorem Delor</li></ul>'


Input 2: ' <ul> <li>Lorem <b>Ipsum</b> Delor </li> </ul> '

Expected Output 2: '<ul><li>Lorem <b>Ipsum</b> Delor</li></ul>'

Output of the Stack Overflow solutions: '<ul><li>Lorem<b>Ipsum</b>Delor</li></ul>'


Input 3:

   Stack

    overflow 

Expected Output 3:

   Stack

    overflow 

Many regex solutions ignore inline elements and strip the spaces around them too, which is why the words on the page run together (Input 2). I wonder if there really is a clean solution to this.

Important: This should only affect HTML input, not plain text (Input 3).
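
For reference, a minimal reproduction of the problem with Input 2. The regex is an assumption about what the typical suggestion looks like (strip all whitespace around every tag), and `stripAroundTags` is just a name picked for this sketch:

```javascript
// Assumed "typical" solution: remove whitespace on both sides of every tag.
function stripAroundTags(html) {
  return html.replace(/\s*(<[^>]+>)\s*/g, '$1');
}

const input2 = ' <ul> <li>Lorem <b>Ipsum</b> Delor </li> </ul> ';

// The spaces around the inline <b> element are lost as well.
console.log(stripAroundTags(input2));
// → <ul><li>Lorem<b>Ipsum</b>Delor</li></ul>
```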

Wiktor Stribiżew
Zeyya
  • You just need an HTML minifier. I don't know which language you are using on your server, but I'm sure there is a package that does that. Example with PHP: https://github.com/jenstornell/tiny-html-minifier – Lk77 Aug 18 '22 at 09:43
  • Why would you need to do this? If you want to reduce file size, wouldn't it be better to serve the pages with gzip compression rather than minifying your HTML with JS? – Pete Aug 18 '22 at 10:54
  • Thanks @Pete, but it's not about the file size. I needed such a structure in my project. – Zeyya Aug 18 '22 at 11:13
  • That was my question though: why would you "need" that structure if not for file size? Browsers don't care about whitespace between elements and don't render it, so why does your project need it? – Pete Aug 18 '22 at 11:19
  • By the time JavaScript removes the whitespace between tags, the page is usually already loaded and rendered, so removing spaces after the fact will not save any resources. Right-click and "view source" will still show the server's response. What is this trying to improve? Or is it just a (home)work assignment? – Peter Krebs Aug 18 '22 at 11:36

2 Answers


You can use these two regular expressions, which remove line breaks and extra spaces.

const input = `
  <ul>   
      <li>Lorem Delor  </li>
      <li>Lorem Delor  </li> 
  </ul>
`;

const output = input
    // remove all whitespace (line breaks, spaces, tabs) between adjacent tags
    .replace(/>\s+</g, "><")
    // keep tags untouched; collapse every other whitespace run to a single space
    .replace(/(<.*?>)|\s+/g, (m, $1) => $1 || ' ')
    .trim();
    
console.log(output);

In the example in your question you want to remove every space before a closing tag, but I consider that undesirable. Such a space can be intentional (the tag may belong to an inline element, and you might want to keep the space). So the second regular expression leaves a single space before a closing tag if there were one or more spaces there. If you really want to remove all spaces (you shouldn't), just replace ' ' with '', but note that this also removes the spaces between words.
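
If the exact outputs from the question are required, one possible compromise is to strip whitespace only around tags that are not inline. The following is a rough sketch under several assumptions: `INLINE_TAGS` is a hand-picked, incomplete list, `minifyHtml` is a hypothetical helper name, attribute values are assumed not to contain `>`, and whitespace-sensitive elements such as `<pre>` or `<textarea>` are not handled.

```javascript
// Hypothetical, incomplete list of inline tags whose surrounding spaces matter.
const INLINE_TAGS = ['a', 'abbr', 'b', 'code', 'em', 'i', 'small', 'span', 'strong', 'u'];

// Matches any opening or closing tag that is NOT in INLINE_TAGS,
// together with the whitespace on both sides of it.
const BLOCK_TAG = new RegExp(
  `\\s*(<\\/?(?!(?:${INLINE_TAGS.join('|')})\\b)[a-zA-Z][^>]*>)\\s*`,
  'g'
);

function minifyHtml(input) {
  // Leave plain text untouched when no tag is found (Input 3).
  if (!/<[a-zA-Z][^>]*>/.test(input)) return input;

  return input
    // keep tags as they are; collapse other whitespace runs to one space
    .replace(/(<[^>]*>)|\s+/g, (match, tag) => tag || ' ')
    // drop whitespace around non-inline tags only
    .replace(BLOCK_TAG, '$1')
    .trim();
}

console.log(minifyHtml(' <ul> <li>Lorem Delor </li> </ul> '));
// → <ul><li>Lorem Delor</li></ul>
console.log(minifyHtml(' <ul> <li>Lorem <b>Ipsum</b> Delor </li> </ul> '));
// → <ul><li>Lorem <b>Ipsum</b> Delor</li></ul>
```

The spaces around `<b>` survive because `b` is in the inline list, while plain text without any tag is returned unchanged. For anything beyond simple fragments, a real HTML parser or minifier is likely the safer choice.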

Regex source

General Grievance
Jax-p

Use the regex /\s+/gim to collapse each run of whitespace into a single space.

txt.replace(/\s+/gim, ' ')

Use the regex />\s+</gim to remove whitespace between > and <.

txt.replace(/>\s+</gim, '><')

Code:

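// Inputs 1 and 2 from the question, padded with extra whitespace
const input1 = '   <ul>   <li>Lorem Delor  </li>  </ul>  ';
const input2 = `   <ul>   <li>Lorem <b>Ipsum</b> Delor  </li>  </ul>  `;

// Collapse whitespace runs, trim both ends, then drop whitespace between tags
console.log(input1.replace(/\s+/gim, ' ').trim().replace(/>\s+</gim, '><'));
console.log(input2.replace(/\s+/gim, ' ').trim().replace(/>\s+</gim, '><'));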

Output:

'<ul><li>Lorem Delor </li></ul>'
'<ul><li>Lorem <b>Ipsum</b> Delor </li></ul>'
Art Bindu