0

I have an HTML file open in Sublime Text, and I am trying to add a newline character after every third </tr> tag to group the rows into sets of three. (This only needs to be pleasing to my eyes; the file will not be displayed on any web page.) How can I do this with a regex? I can match all the tags easily enough with (</tr>), but I want to replace only every third one with $1\n

Some example data:

    <tr data-id="17622538">
        <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28807">1d</th>
        <td>Order 4990792 for symbol NASDAQ:OCUL has been executed at price 7.08 for 2000 shares</td>
    </tr>
    <tr data-id="17622537">
        <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28808">1d</th>
        <td>Order 4990792 successfully placed</td>
    </tr>
    <tr data-id="17622536">
        <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28809">1d</th>
        <td>Call to place market order to buy 2000 shares of symbol NASDAQ:OCUL </td>
    </tr>
    <tr data-id="17622538">
        <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28807">1d</th>
        <td>Order 4990792 for symbol NASDAQ:OCUL has been executed at price 7.08 for 2000 shares</td>
    </tr>
    <tr data-id="17622537">
        <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28808">1d</th>
        <td>Order 4990792 successfully placed</td>
    </tr>
    <tr data-id="17622536">
        <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28809">1d</th>
        <td>Call to place market order to buy 2000 shares of symbol NASDAQ:OCUL </td>
    </tr>
    <tr data-id="17622538">
        <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28807">1d</th>
        <td>Order 4990792 for symbol NASDAQ:OCUL has been executed at price 7.08 for 2000 shares</td>
    </tr>
    <tr data-id="17622537">
        <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28808">1d</th>
        <td>Order 4990792 successfully placed</td>
    </tr>
    <tr data-id="17622536">
        <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28809">1d</th>
        <td>Call to place market order to buy 2000 shares of symbol NASDAQ:OCUL </td>
    </tr>

The desired output would look like:

<tr data-id="17622538">
    <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28807">1d</th>
    <td>Order 4990792 for symbol NASDAQ:OCUL has been executed at price 7.08 for 2000 shares</td>
</tr>
<tr data-id="17622537">
    <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28808">1d</th>
    <td>Order 4990792 successfully placed</td>
</tr>
<tr data-id="17622536">
    <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28809">1d</th>
    <td>Call to place market order to buy 2000 shares of symbol NASDAQ:OCUL </td>
</tr>

<tr data-id="17622538">
    <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28807">1d</th>
    <td>Order 4990792 for symbol NASDAQ:OCUL has been executed at price 7.08 for 2000 shares</td>
</tr>
<tr data-id="17622537">
    <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28808">1d</th>
    <td>Order 4990792 successfully placed</td>
</tr>
<tr data-id="17622536">
    <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28809">1d</th>
    <td>Call to place market order to buy 2000 shares of symbol NASDAQ:OCUL </td>
</tr>

<tr data-id="17622538">
    <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28807">1d</th>
    <td>Order 4990792 for symbol NASDAQ:OCUL has been executed at price 7.08 for 2000 shares</td>
</tr>
<tr data-id="17622537">
    <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28808">1d</th>
    <td>Order 4990792 successfully placed</td>
</tr>
<tr data-id="17622536">
    <th title="10/14/2016, 7:37:13 AM" data-ago-date-timer="28809">1d</th>
    <td>Call to place market order to buy 2000 shares of symbol NASDAQ:OCUL </td>
</tr>
Mr Lister
Mike
  • Please update your question with sample code, a valid corpus, and your desired output. – Todd A. Jacobs Oct 15 '16 at 19:47
  • Do not use regexes for HTML. [Seriously, do not](http://stackoverflow.com/a/1732454/1934349). – paulotorrens Oct 15 '16 at 19:47
  • errrr... I'm going to second @paulotorrens, but are you trying to insert carriage returns after ? If you are, why aren't you matching that instead of trying to match "every third tag"? – David Hoelzer Oct 15 '16 at 19:50
  • It's sublime text mate. @paulotorrens – revo Oct 15 '16 at 19:50
  • I think we get that Revo. It says so in the question. The bigger question is why are we searching for every third tag rather than the specific tags that need the extra carriage returns? If it's every third then why not match ((?!).*)<((?!).*) – David Hoelzer Oct 15 '16 at 19:53
  • @paulotorrens: I get tired of seeing this stupid link over and over. The accepted answer is perhaps funny, but it doesn't help at all, and there are too many other answers for anyone to make a good choice. But I won't be too hard on you, since all noobs do the same. In short: 1) Yes, the HTML language is more complicated than people think. 2) It's **possible** to describe HTML with an advanced regex engine like PCRE. There's absolutely **no theoretical reason** to prevent it. 3) It's difficult in general (in which case a parser will help as much as it can), but it can also be easy in particular cases. – Casimir et Hippolyte Oct 15 '16 at 20:39
  • @CasimiretHippolyte: maybe it's stupid, but it's true. By the time I made the comment, as you can see in the edit history, OP hadn't given any example text, nor was it mentioned that he was trying to do this in a code editor. Otherwise I could have suggested a regex for him in this particular scenario. Also, there _is_ a **theoretical reason** to prevent it: true, PCRE is able to parse HTML/XML (and even more, [see my question](http://stackoverflow.com/questions/38688009)), but doing so uses backrefs, and matching with backrefs is NP-complete and potentially exponential in time, which might lead to ReDoS in general. – paulotorrens Oct 15 '16 at 21:26
  • Do people realise that inserting br elements in between tr elements is not a good idea? – Mr Lister Oct 21 '16 at 17:58

2 Answers

3

Replace:

((<tr[^<]+<th[^<]+<\/th>\s+<td[^<]+</td>\s+</tr>\s*){3})

With:

\1\n
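
To try the replacement outside the editor, here is a minimal Python sketch that applies the same pattern and replacement with `re.sub`. It assumes Python's `re` module treats this pattern the same way Sublime Text's engine does, which holds for the constructs used here; `group_rows`, `make_row`, and the shortened rows are hypothetical and exist only for illustration.

```python
import re

# Pattern and replacement copied from the answer above.
PATTERN = r"((<tr[^<]+<th[^<]+<\/th>\s+<td[^<]+</td>\s+</tr>\s*){3})"
REPLACEMENT = r"\1\n"


def group_rows(html: str) -> str:
    """Append a newline after each run of three complete <tr> rows."""
    return re.sub(PATTERN, REPLACEMENT, html)


def make_row(row_id: int) -> str:
    # Hypothetical helper: one unindented row shaped like the sample data.
    return (
        f'<tr data-id="{row_id}">\n'
        f'    <th title="10/14/2016, 7:37:13 AM">1d</th>\n'
        f"    <td>Order {row_id} placed</td>\n"
        "</tr>\n"
    )


# Seven rows: two full groups of three plus one leftover row.
sample = "".join(make_row(i) for i in range(7))
result = group_rows(sample)
print(result)
```

The leftover seventh row is left untouched because the pattern only matches complete groups of three. Note that the trailing `\s*` also swallows any indentation in front of the following `<tr>`, so with indented input that indentation ends up on the otherwise blank line and the next `<tr>` starts at column 0.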
Taha Paksu
1
((\n.*?)+</tr>){3}

Then choose Find All and press the right arrow key. Every cursor now sits just after a third </tr>, so pressing Enter inserts the newline in each spot.
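
The same idea can be sketched in Python, assuming the pattern behaves the same under Python's `re` module: find every match, then insert a newline at the end of each one, which mirrors placing a cursor after each match and typing. `insert_after_matches` and the shortened sample rows are hypothetical names and data used only for this illustration.

```python
import re

# Pattern copied from the answer above.
PATTERN = re.compile(r"((\n.*?)+</tr>){3}")


def insert_after_matches(text: str, insertion: str = "\n") -> str:
    """Insert `insertion` at the end of every match, like typing at each cursor."""
    pieces = []
    last = 0
    for match in PATTERN.finditer(text):
        # Copy everything up to the end of the match, then add the insertion.
        pieces.append(text[last:match.end()])
        pieces.append(insertion)
        last = match.end()
    pieces.append(text[last:])  # Tail after the final match, left unchanged.
    return "".join(pieces)


# Hypothetical rows shaped like the sample data: two full groups plus one extra.
row = '<tr data-id="{}">\n    <th>1d</th>\n    <td>Order placed</td>\n</tr>\n'
sample = "".join(row.format(i) for i in range(7))
result = insert_after_matches(sample)
print(result)
```

Because each match must begin with a line break, a `<tr>` on the very first line of the file is not part of the first match, but the three `</tr>` tags are still counted correctly.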

Liu