
I have a long text (more than 10,000 words) containing HTML tags, stored in a string.

I want to wrap every 1000 words in <div class="chunk"></div>, automatically closing any HTML tags left open at the end of a chunk and reopening them at the start of the next chunk.

I found many solutions, but they count characters rather than words and don't handle automatically opening/closing HTML tags.

PHP's wordwrap function doesn't address the HTML tag problem either.

Simulation

<div id="long-text">
   Dynamic long text more than 10,000 words (Text contains HTML (img, p, span, i, ...etc) tags)
</div>

Wrong result

<div id="long-text">
   <div class="chunk">
      <p>Chunk 1 : first approximately 1000 words with their html tags
         <img src="image.jpg"> ## Unclosed <p> tag ##
   </div>

   <div class="chunk">
      ## The closed <p> tag of the previous chunk ## 
      </p><p>Chunk 2 : second approximately 1000 words with their html tags
      <img src="image.jpg"> </p><p> ## unclosed <p> tag ##
   </div>

   <div class="chunk">
      ## Missing open <p> tag because it was cut in the previous chunk ##
      Chunk 3 : third approximately 1000 words with their html tags</p>
   </div>
</div>

Expected result

<div id="long-text">
       <div class="chunk">
          <p>Chunk 1 : first approximately 1000 words with their html tags
             <img src="image.jpg"> </p>
       </div>

       <div class="chunk">
          <p>Chunk 2 : second approximately 1000 words with their html tags
          <img src="image.jpg"> </p>
       </div>

       <div class="chunk">
          <p>Chunk 3 : third approximately 1000 words with their html tags</p>
       </div>
    </div>

Then I can paginate the result with JavaScript.

After searching, I found the accepted answer here: Shortening text tweet-like without cutting links inside. It cuts the text (from the start only) and auto-closes opened HTML tags.

I tried to modify that code to also reopen closed tags when cutting from the middle of the text, but unfortunately I couldn't get it to work.

I don't mind if there is another, better solution for paginating the long text by word count, using PHP, JavaScript, or both.

semsem
  • XY problem, why do you have a 10k word long text? – madalinivascu Apr 03 '17 at 08:06
  • A database stores documentation and datasheets; each document has at least 3k words and some have 10k words. – semsem Apr 03 '17 at 08:11
  • A solution would be to load all 10k words and use JS to show only a part of them, using either a scroll or an infinite-scroll type implementation – madalinivascu Apr 03 '17 at 08:14
  • I doubt there's a simple solution to this when there's HTML involved in the text. Imagine the scenario `<p>1.2 k long paragraph</p>`? You will need to break that up into 2 `<p>` tags, one with 1k words and one with 200 words. – apokryfos Apr 03 '17 at 08:25
  • @madalinivascu this is a nice solution, but pages are more suitable for this work. – semsem Apr 03 '17 at 09:29

1 Answer


So the idea is to use jQuery to chunk the immediate children of the container by cloning each one and splitting its inner text. It may need some more work for further nested HTML, but it's a start:

// Split an element into pieces of at most `length` words each.
// Note: .text() flattens any nested markup inside the element.
function chunkText(length) {
  var words = $(this).text().split(" ");
  var res = [$(this)];
  if (words.length > length) {
    // Clone the element so the overflow keeps the same tag and attributes.
    var overflow = $(this).clone();
    $(this).text(words.slice(0, length).join(" "));
    overflow.text(words.slice(length).join(" "));
    // Keep splitting the overflow until every piece fits.
    res = res.concat(chunkText.call(overflow, length));
  }
  return res;
}

var br = 10; // Words to split on

$("#long-text > *").each(function () {
  var chunks = chunkText.call(this, br);
  $.each(chunks, function (i, v) {
    $("#long-text")
      .append($("<div>").addClass("chunk").append(v))
      .append($("<img>").attr("src", "image.jpg"));
  });
});

Basic demo: https://jsfiddle.net/o2d8zf4v/
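
For the nested-HTML case, one possible direction is to work on the HTML string instead of the DOM: walk it token by token, keep a stack of the tags that are currently open, close them when a chunk fills up, and reopen them at the start of the next chunk. The following is only a minimal sketch of that idea, not tested against real documents. The names `chunkHtml` and `VOID_TAGS` are made up for illustration, and it assumes reasonably well-formed HTML with no `>` inside attribute values and no `<script>` or `<style>` content. Reopened tags are copied verbatim, so an `id` attribute on a split element would end up duplicated.

```javascript
// Hypothetical helper (not part of the demo above): split an HTML string into
// chunks of at most `limit` words, closing open tags at the end of each chunk
// and reopening them at the start of the next one.
const VOID_TAGS = new Set(["area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr"]);

function chunkHtml(html, limit) {
  // A token is either a whole tag or a run of text between tags.
  const tokens = html.match(/<[^>]+>|[^<]+/g) || [];
  const chunks = [];
  const open = [];   // stack of { name, tag } for elements currently open
  let current = "";  // markup collected for the chunk being built
  let count = 0;     // words in the chunk being built

  function flush() {
    // Close the open elements, innermost first.
    const closers = open.slice().reverse().map(t => "</" + t.name + ">").join("");
    chunks.push('<div class="chunk">' + current + closers + "</div>");
    // Start the next chunk by reopening the same elements, outermost first.
    current = open.map(t => t.tag).join("");
    count = 0;
  }

  for (const token of tokens) {
    if (token[0] === "<") {
      const m = /^<\s*(\/?)\s*([a-zA-Z][\w-]*)/.exec(token);
      if (m) {
        const name = m[2].toLowerCase();
        if (m[1]) {
          // Closing tag: drop the nearest matching element and anything
          // left open inside it.
          const idx = open.map(t => t.name).lastIndexOf(name);
          if (idx !== -1) open.splice(idx);
        } else if (!VOID_TAGS.has(name) && !/\/\s*>$/.test(token)) {
          open.push({ name: name, tag: token });
        }
      }
      current += token; // tags (and comments) never count as words
      continue;
    }
    // Text: split into words and whitespace runs, keeping the whitespace.
    for (const piece of token.split(/(\s+)/)) {
      if (piece === "") continue;
      if (/^\s+$/.test(piece)) { current += piece; continue; }
      if (count === limit) flush(); // chunk is full: cut before this word
      current += piece;
      count++;
    }
  }
  if (current.trim() !== "") flush();
  return chunks;
}
```

A possible way to use it with the markup from the question would be `$("#long-text").html(chunkHtml($("#long-text").html(), 1000).join(""))`. Cutting exactly at the word limit can split a paragraph in two, which is the trade-off raised in the comments.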

apokryfos