How can I get the preceding text in the following ES6 loop?

Question

I'm writing a Markdown parser with ES6:

Input:

# Title

* * *

Paragraph

Another paragraph

Sample code:

// create a find/replace based on a regex and replacement 
// to later be used with the conversion functions
function massReplace(text, replacements) {
  let result = text
  for (let [regex, replacement] of replacements) {
    result = result.replace(regex, replacement)
  }
  return result
}

// match text with # and replace them for html headings
function convertHeadings(text, orig) {
  if (orig.match(/^#{1,6}\s/)) {
    // the replacements must be passed as one array of [regex, replacement] pairs
    return massReplace(text, [
      [/^### (.*)/gm, '<h3>$1</h3>'],
      [/^## (.*)/gm,  '<h2>$1</h2>'],
      [/^# (.*)/gm,   '<h1>$1</h1>']
    ])
  }
}

// match text without # and surround them with p tags
function convertParagraphs(text, orig) {
  if (!orig.match(/^#{1,6} (.*)/)) {
    return `<p>${text}</p>`
  }
}

// take the source, split on new lines, make a copy (to 
// have a "clean" version to be used in the if statements),
// and finally apply the conversion functions to them with
// the help of a loop and excluding those that output undefined
function convertToHTML(markdownSource) {
  let data = markdownSource.split('\n\n')
    , orig = data.slice()
    , conversions = [ convertHeadings, convertParagraphs]

  for (let i = 0, l = orig.length; i < l; ++i) {
    for (let conversion of conversions) {
      let result = conversion(data[i], orig[i])
      if (result !== undefined) {
        data[i] = result
      }
    }
  }

  return data.join('\n\n')
}

What I want now is to wrap p tags with the class no-indent around text that has * * * preceding it (Paragraph in the example above). The problem is, I don't know how to get a text based on its preceding one (* * * in this case).

To give an idea this is the desired output:

<h1>Title</h1>

<p>* * *</p>

<p class="no-indent">Paragraph</p>

<p>Another paragraph</p>

Can you describe your question more? How are the functions `massReplace` and `convertParagraphs` called? And it's better to write your input as HTML — Omar Elawady, Apr 10 '15 at 15:00
@Omar Elawady How about now? I don't understand why you suggest to write the input in HTML though, since the real input should be in Markdown. Anyhow, I included the HTML part as output. — alexchenco, Apr 10 '15 at 15:10
(@omar seems a bit lost... :) ) Upvoted your Q. It's a really good formatted one! Thumbs up — Roko C. Buljan, Apr 10 '15 at 15:15
_“The problem is, I don't know how to get a text based on its preceding one”_ – don’t try to look _forward_, look _back_ instead. When you encounter the `Paragraph` line in your input data, consult the variable that you stored the value of the _previous_ line into. — CBroe, Apr 10 '15 at 15:23
You could perhaps look at parsers and how they are coded, however otherwise would a split line regex do the trick http://stackoverflow.com/questions/1979884/how-to-use-javascript-regex-over-multiple-lines — user5321531, Apr 10 '15 at 22:13
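The look-back idea from CBroe's comment can be sketched roughly as follows. This is a minimal illustration, not the asker's final code: `isRule` and `renderBlocks` are hypothetical names, and the sketch assumes blocks are separated by blank lines and that a rule is written exactly as `* * *`.

```javascript
// Hypothetical helper: true when a block is exactly "* * *".
const isRule = (block) => /^\* \* \*$/.test(block.trim())

// Hypothetical replacement for the inner loop of convertToHTML:
// each block can consult the untouched block that precedes it.
function renderBlocks(markdownSource) {
  const orig = markdownSource.split('\n\n')

  return orig.map((block, i) => {
    // undefined for the first block, otherwise the original previous block
    const previous = i > 0 ? orig[i - 1] : undefined

    const heading = block.match(/^(#{1,6})\s+(.*)/)
    if (heading) {
      const level = heading[1].length
      return `<h${level}>${heading[2]}</h${level}>`
    }

    if (previous !== undefined && isRule(previous)) {
      return `<p class="no-indent">${block}</p>`
    }

    return `<p>${block}</p>`
  }).join('\n\n')
}
```

Because the check reads from the unmodified `orig` array rather than from already-converted output, it does not matter that the `* * *` block has itself been wrapped in `<p>` tags by the time the next block is processed.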

score 1 · Answer 1 · edited May 23 '17 at 11:50

You're asking a question about tokenising and parsing, and possibly about lookahead parsing in particular:

Wikipedia pages:
https://en.wikipedia.org/wiki/Lexical_analysis
https://en.wikipedia.org/wiki/Parsing

StackOverflow tokenizing:
https://stackoverflow.com/questions/tagged/token
https://stackoverflow.com/questions/tagged/tokenize

StackOverflow parsing questions:
https://stackoverflow.com/questions/tagged/parsing
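
To make the tokenising suggestion concrete, here is a small sketch that first classifies each block into a token and then renders the tokens, so the renderer can inspect the type of the neighbouring token. The names `tokenize` and `render` and the token shape (`type`, `level`, `text`) are assumptions made for illustration; a real Markdown tokenizer would need to handle far more cases.

```javascript
// Step 1 (hypothetical): classify each blank-line-separated block.
function tokenize(markdownSource) {
  return markdownSource.split('\n\n').map((block) => {
    const heading = block.match(/^(#{1,6})\s+(.*)/)
    if (heading) {
      return { type: 'heading', level: heading[1].length, text: heading[2] }
    }
    if (/^\* \* \*$/.test(block.trim())) {
      return { type: 'rule', text: block }
    }
    return { type: 'paragraph', text: block }
  })
}

// Step 2 (hypothetical): render each token, looking at the previous one.
function render(tokens) {
  return tokens.map((token, i) => {
    const previous = tokens[i - 1] // undefined for the first token
    switch (token.type) {
      case 'heading':
        return `<h${token.level}>${token.text}</h${token.level}>`
      case 'rule':
        // rendered as a plain paragraph to match the desired output in the question
        return `<p>${token.text}</p>`
      default:
        return previous && previous.type === 'rule'
          ? `<p class="no-indent">${token.text}</p>`
          : `<p>${token.text}</p>`
    }
  }).join('\n\n')
}
```

With the tokens in hand, `render(tokenize(source))` decides the paragraph class from the previous token's `type` instead of re-matching raw text, which tends to scale better as more block types are added.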
