I have HTML like this:

div { position:absolute}
<div style="left: 90px; top: 769.265px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.912303);">a dynamic compiler for JavaScript based on our technique and we</div>
<div style="left: 90px; top: 785.869px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.921039);">have measured speedups of 10x and more for certain benchmark</div>
<div style="left: 90px; top: 802.473px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.894838);">programs.</div>
<div style="left: 90px; top: 828.331px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.947363);">Categories and Subject Descriptors</div>
<div style="left: 327.581px; top: 828.48px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(1.00068);">D.3.4 [</div>
<div style="left: 371.618px; top: 828.63px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.944797);">Programming Lan-</div>
<div style="left: 90px; top: 845.234px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.857653);">guages</div>
<div style="left: 132.037px; top: 845.085px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.898342);">]: Processors —</div>
<div style="left: 231.234px; top: 845.234px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.909762);">Incremental compilers, code generation</div>
<div style="left: 469.214px; top: 845.085px; font-size: 14.944px; font-family: sans-serif;">.</div>

This code renders as follows:

[image: rendered output of the divs above]

Based on the coordinates, I want to turn the code into this:

a dynamic compiler for JavaScript based on our technique and we
<br />
have measured speedups of 10x and more for certain benchmark
<br />
programs.
<br />
Categories and Subject Descriptors D.3.4 [Programming Lan-
<br />
guages]: Processors — Incremental compilers, code generation.

Is there a parser that does this?

For example, the `top` values of the `Categories and Subject Descriptors` div (828.331px) and the `D.3.4 [` div (828.48px) differ by less than 1px, which suggests they are on the same line.

I tried to write a parser in plain JS, but had no luck.

mplungjan
kfir
  • _“I want to make the code like this:”_ - so, no relative positioning at all, despite the question title implication that was what you actually wanted? – CBroe Mar 05 '20 at 12:24
  • If these divs are wrapped into a common parent (with nothing else in that), you could get the `innerText` of that parent - that would get you the individual lines of text already, adding the `br` elements at the line breaks then would be fairly trivial. Only the line breaks around the square brackets would still need handling somehow then. – CBroe Mar 05 '20 at 12:27
  • I updated your question based on the comment on my answer – mplungjan Mar 05 '20 at 13:13

2 Answers

1

By parsing the coordinates, we can do this:

const res = document.getElementById("result");
const divs = [...document.querySelectorAll("#source div")];
const styles = [];
divs.forEach((div, i) => {
  // Parse the inline style attribute into a { property: value } object
  const obj = {};
  div.getAttribute("style").split(";").forEach(style => {
    const colon = style.indexOf(":");
    if (colon === -1) return; // skip the empty piece after the last ";"
    obj[style.substring(0, colon).trim()] = style.substring(colon + 1).trim();
  });
  styles.push(obj);
  const text = div.textContent;
  if (i > 0) {
    // parseFloat ignores the trailing "px"
    const diff = parseFloat(styles[i - 1].top) - parseFloat(obj.top);
    const diffLeft = parseFloat(styles[i - 1].left) - parseFloat(obj.left);
    if (Math.abs(diff) > 1) {
      // top differs by more than 1px: this div starts a new line
      res.innerHTML += "<br/>";
    } else {
      // same line: no space before punctuation or before a closely following fragment
      res.innerHTML += (/^\W/.test(text) || Math.abs(diffLeft) < 50) ? "" : " ";
    }
  }
  res.innerHTML += text;
});
section div {
  position: absolute
}
<div id="result"></div>
<hr/>
<div id="compare">a dynamic compiler for JavaScript based on our technique and we
  <br /> have measured speedups of 10x and more for certain benchmark
  <br /> programs.
  <br /> Categories and Subject Descriptors D.3.4 [Programming Lan-
  <br /> guages]: Processors — Incremental compilers, code generation.</div>

<section id="source">
  <div style="left: 90px; top: 769.265px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.912303);">a dynamic compiler for JavaScript based on our technique and we</div>
  <div style="left: 90px; top: 785.869px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.921039);">have measured speedups of 10x and more for certain benchmark</div>
  <div style="left: 90px; top: 802.473px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.894838);">programs.</div>
  <div style="left: 90px; top: 828.331px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.947363);">Categories and Subject Descriptors</div>
  <div style="left: 327.581px; top: 828.48px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(1.00068);">D.3.4 [</div>
  <div style="left: 371.618px; top: 828.63px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.944797);">Programming Lan-</div>
  <div style="left: 90px; top: 845.234px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.857653);">guages</div>
  <div style="left: 132.037px; top: 845.085px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.898342);">]: Processors —</div>
  <div style="left: 231.234px; top: 845.234px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.909762);">Incremental compilers, code generation</div>
  <div style="left: 469.214px; top: 845.085px; font-size: 14.944px; font-family: sans-serif;">.</div>
</section>
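
The same grouping idea can also be written as a DOM-free function, which makes it easier to test. This is only a sketch and not part of the snippet above: the name `groupIntoLines`, the `{ left, top, text }` fragment shape, and the punctuation rules used for spacing are all assumptions, and the default `tolerance` of 1px simply reuses the 1px threshold from the snippet.

```javascript
// Hypothetical helper: groups positioned text fragments into visual lines.
// `fragments` is assumed to be [{ left, top, text }, ...] with numeric px values.
function groupIntoLines(fragments, tolerance = 1) {
  // Sort by vertical position so fragments of the same line end up adjacent.
  const sorted = [...fragments].sort((a, b) => a.top - b.top);
  const lines = [];
  for (const frag of sorted) {
    const current = lines[lines.length - 1];
    // Same line if `top` is within `tolerance` px of the line's first fragment.
    if (current && Math.abs(frag.top - current[0].top) <= tolerance) {
      current.push(frag);
    } else {
      lines.push([frag]);
    }
  }
  return lines.map(line =>
    line
      .sort((a, b) => a.left - b.left) // left-to-right reading order
      .reduce((out, frag) => {
        // Assumed spacing rule: no space before closing punctuation
        // or directly after an opening bracket.
        const noSpace =
          out === "" || /^[\]).,;:!?]/.test(frag.text) || /[\[(]$/.test(out);
        return out + (noSpace ? "" : " ") + frag.text;
      }, "")
  );
}

// The last two visual lines from the question, as plain data.
const fragments = [
  { left: 90, top: 828.331, text: "Categories and Subject Descriptors" },
  { left: 327.581, top: 828.48, text: "D.3.4 [" },
  { left: 371.618, top: 828.63, text: "Programming Lan-" },
  { left: 90, top: 845.234, text: "guages" },
  { left: 132.037, top: 845.085, text: "]: Processors —" },
  { left: 231.234, top: 845.234, text: "Incremental compilers, code generation" },
  { left: 469.214, top: 845.085, text: "." },
];

console.log(groupIntoLines(fragments).join("\n<br />\n"));
```

In the browser, the fragments could be built from the divs with something like `parseFloat(div.style.top)` and `div.textContent`. Spacing here is decided by punctuation rather than by the `left` gap, so the rules would likely need adjusting for other text.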

WITHOUT parsing the coordinates, I came up with this. I will leave it here.

const res = document.getElementById("result");
const divs = [...document.querySelectorAll("#source div")].map(div => div.innerText)

res.innerHTML = divs.join("<br/>")
//      .replace("-<br/>","")
  .replace("—<br/>","—")
  .replace(/<br\/>([\]\[\.,\?\!])+/g,"$1")
  .replace(/([\]\[])+<br\/>/g,"$1")
section div { position: absolute }
<div id="result"></div>
<hr/>
<div id="compare">a dynamic compiler for JavaScript based on our technique and we
<br />
have measured speedups of 10x and more for certain benchmark
<br />
programs.
<br />
Categories and Subject Descriptors D.3.4 [Programming Lan-
<br />
guages]: Processors — Incremental compilers, code generation.</div>

<section id="source">
  <div style="left: 90px; top: 769.265px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.912303);">a dynamic compiler for JavaScript based on our technique and we</div>
  <div style="left: 90px; top: 785.869px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.921039);">have measured speedups of 10x and more for certain benchmark</div>
  <div style="left: 90px; top: 802.473px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.894838);">programs.</div>
  <div style="left: 90px; top: 828.331px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.947363);">Categories and Subject Descriptors</div>
  <div style="left: 327.581px; top: 828.48px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(1.00068);">D.3.4 [</div>
  <div style="left: 371.618px; top: 828.63px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.944797);">Programming Lan-</div>
  <div style="left: 90px; top: 845.234px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.857653);">guages</div>
  <div style="left: 132.037px; top: 845.085px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.898342);">]: Processors —</div>
  <div style="left: 231.234px; top: 845.234px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.909762);">Incremental compilers, code generation</div>
  <div style="left: 469.214px; top: 845.085px; font-size: 14.944px; font-family: sans-serif;">.</div>
</section>
mplungjan
0

See comments inline:

var results = document.createElement("section");

// Get all the div elements and loop over them
document.querySelectorAll("div").forEach(function(div){
  // Create a new text node and populate with the content of the div
  let t = document.createTextNode(div.textContent);
  
  // Create a <br>
  let b = document.createElement("br");
  
  // Append the elements to the parent
  results.appendChild(t);
  results.appendChild(b);
  
  // Remove the original div from the document
  div.remove();
});

// Append the results to the page
document.body.appendChild(results);

console.log(results.innerHTML);
<div style="left: 90px; top: 769.265px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.912303);">a dynamic compiler for JavaScript based on our technique and we</div>
<div style="left: 90px; top: 785.869px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.921039);">have measured speedups of 10x and more for certain benchmark</div>
<div style="left: 90px; top: 802.473px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.894838);">programs.</div>
<div style="left: 90px; top: 828.331px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.947363);">Categories and Subject Descriptors</div>
<div style="left: 327.581px; top: 828.48px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(1.00068);">D.3.4 [</div>
<div style="left: 371.618px; top: 828.63px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.944797);">Programming Lan-</div>
<div style="left: 90px; top: 845.234px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.857653);">guages</div>
<div style="left: 132.037px; top: 845.085px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.898342);">]: Processors —</div>
<div style="left: 231.234px; top: 845.234px; font-size: 14.944px; font-family: sans-serif; transform: scaleX(0.909762);">Incremental compilers, code generation</div>
<div style="left: 469.214px; top: 845.085px; font-size: 14.944px; font-family: sans-serif;">.</div>
Scott Marcus
  • They want `Categories and Subject Descriptors D.3.4 [Programming Lan-` and `guages]: Processors — Incremental compilers, code generation.` as one un-broken line each though … – CBroe Mar 05 '20 at 12:29