
I have a long String like this.

<p>Some Text above the tabular data. I hope this text will be seen.</p>

<table border="1" cellpadding="0" cellspacing="0">
    <tbody>
        <tr>
            <td style="width:150px">
            <p>S.No.</p>
            </td>



            </td>
        </tr>
        <tr>
            <td style="width:150px">
            <p>2</p>
            </td>


    </tbody>
</table>

<p>&nbsp;</p>

<p>Please go through this tabular data.</p>

<table border="1" cellpadding="0" cellspacing="0">
    <tbody>
        <tr>
            <td style="width:150px">
            <p>S.No.</p>
            </td>


        </tr>
        <tr>
            <td style="width:150px">
            <p>1</p>
            </td>


        <tr>
            <td style="width:150px">
            >
            </td>

            </td>
        </tr>
    </tbody>
</table>


<p>End Of String</p>

Now I want to extract the whole string before and after each HTML table, like this, and put "HTML Table..." in place of each HTML table. I tried a few things but was not able to achieve it. I tried splitting the string into arrays, but it didn't work.

Sample Output

<p>Some Text above the tabular data. I hope this text will be seen.</p>

<p>&nbsp;</p>
HTML Table.... 
<p>Please go through this tabular data.</p>


<p>End Of String</p>
John

2 Answers


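You can do this with String.replaceAll, using a regex with the inline flags (?is): i makes the match case-insensitive, and s (DOTALL) lets . match line breaks so that a table spanning several lines is matched as a whole: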

String noTables = longTableString.replaceAll("(?is)(\\<table .*?\\</table\\>)", "HTML Table...");
// result
<p>Some Text above the tabular data. I hope this text will be seen.</p>

HTML Table...

<p>&nbsp;</p>

<p>Please go through this tabular data.</p>

HTML Table...


<p>End Of String</p>
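
As a minimal, self-contained sketch of the same idea: the class name TableStripper, the method stripTables, and the short sample input are made up for illustration and are not part of the answer above. It also assumes a slightly relaxed pattern, <table\b instead of <table followed by a space, so that a bare <table> tag without attributes is matched as well.

```java
import java.util.regex.Pattern;

class TableStripper {
    // (?i) = case-insensitive, (?s) = DOTALL, so '.' also matches line breaks.
    // The lazy .*? stops at the first closing </table> after each opening tag.
    private static final Pattern TABLE = Pattern.compile("(?is)<table\\b.*?</table>");

    // Replaces every <table>...</table> block with a fixed placeholder.
    static String stripTables(String html) {
        return TABLE.matcher(html).replaceAll("HTML Table...");
    }

    public static void main(String[] args) {
        // Hypothetical sample input, much shorter than the string in the question.
        String html = "<p>Above</p>\n"
                + "<table border=\"1\">\n<tr><td>1</td></tr>\n</table>\n"
                + "<p>Below</p>";
        System.out.println(stripTables(html));
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call, which String.replaceAll does internally. Nested tables are not handled by this sketch: the lazy match ends at the first </table> it finds.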

Nowhere Man

This may not be the most elegant solution, but you can start by using a regex to locate the tables and then replace each one with the desired content. Something like the code below will help.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String htmlStr = <your html string> ;
    Pattern pattern = Pattern.compile( "(<table)([\\s\\S]*?)(</table>)" ); // [\\s\\S] matches any character, including line breaks
    Matcher matcher = pattern.matcher( htmlStr );
    String result = htmlStr;
    while( matcher.find() )
    {
        // replace the matched table element with the placeholder text
        result = result.replace( matcher.group(), "HTML Table...." );
    }
    System.out.println( result ); // print output

There are a few drawbacks to this approach. The regex must match the actual HTML content (for example, the pattern above is case-sensitive and will not match <TABLE>). The whitespace around each placeholder is whatever surrounded the table in the original string, so you have little control over spacing in the output. More importantly, regex evaluation can be CPU-intensive, depending on the size of your HTML string.

This is just an approach to try.
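
One way to sketch a single-pass variant of this loop is with Matcher.appendReplacement and Matcher.appendTail, which build the output while scanning instead of calling String.replace on the whole string for every match. The names TableReplacer and replaceTables and the sample input are hypothetical, and the case-insensitive flag and the <table\b form of the pattern are assumptions added here, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TableReplacer {
    // [\s\S] matches any character including line breaks; *? keeps the match lazy.
    private static final Pattern TABLE =
            Pattern.compile("<table\\b[\\s\\S]*?</table>", Pattern.CASE_INSENSITIVE);

    // Walks the input once, copying the text between tables and
    // writing the placeholder in place of each matched table.
    static String replaceTables(String html, String placeholder) {
        Matcher matcher = TABLE.matcher(html);
        StringBuffer out = new StringBuffer();
        // quoteReplacement keeps '$' and '\' in the placeholder literal.
        String safePlaceholder = Matcher.quoteReplacement(placeholder);
        while (matcher.find()) {
            matcher.appendReplacement(out, safePlaceholder);
        }
        matcher.appendTail(out); // copy whatever follows the last table
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample input with two tables.
        String html = "<p>Intro</p><table><tr><td>1</td></tr></table>"
                + "<p>Middle</p><table>\n<tr><td>2</td></tr>\n</table><p>End</p>";
        System.out.println(replaceTables(html, "HTML Table...."));
    }
}
```

StringBuffer is used because the StringBuilder overloads of appendReplacement and appendTail only exist from Java 9 onward.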

Klaus
  • Thanks! "the regex evaluation is CPU intensive depending on the size of your HTML string." So what about its complexity? @Klaus – John May 12 '20 at 17:46
  • @Sam Read here: https://stackoverflow.com/questions/5892115/whats-the-time-complexity-of-average-regex-algorithms . This really depends on the size of the input. For example, someone could provide an input large enough to exhaust the CPU while it is evaluated against the regex. – Klaus May 12 '20 at 17:57