two<div
class="blogger-post-footer"><img
width='1' height='1'
src='https://blogger.googleusercontent.com/tracker/4997742813462440000-8247376481926663915?l=isthereanyurlnamesleft.blogspot.com'
alt='' /></div>
I need to match from <div to </div>.
You can use (<div.*?<\/div>). The first capture group (backreference) then holds the match from <div to </div>.
You must use the /s flag (or an equivalent for the language you use) with this regex to let the . match newlines. Documentation about /s says:
To simplify multi-line substitutions, the "." character never matches a newline unless you use the /s modifier, which in effect tells Perl to pretend the string is a single line--even if it isn't.
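To make the effect of the flag concrete, here is a minimal sketch in Python, where re.DOTALL plays the role of Perl's /s. The variable names and the shortened tracker URL are placeholders of mine, not part of the question.

```python
import re

# Sample modelled on the question: the <div> spans several lines.
# The src URL is a shortened placeholder.
html = """two<div
class="blogger-post-footer"><img
width='1' height='1'
src='https://example.invalid/tracker'
alt='' /></div>"""

pattern = r"(<div.*?</div>)"

# Without DOTALL, "." cannot cross a newline, so the search finds nothing.
no_flag = re.search(pattern, html)

# With DOTALL, "." also matches newlines and the whole element is captured.
with_flag = re.search(pattern, html, re.DOTALL)

print(no_flag)             # None
print(with_flag.group(1))  # the text from <div through </div>
```

Group 1 is the "first backreference/group" mentioned above; since the parentheses wrap the whole pattern, group 0 holds the same text.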
You may try the following:
<div.*?</div>
Be sure to enable DOTALL or SINGLELINE mode.
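If passing a flag argument is inconvenient, many regex flavours also accept the inline modifier (?s) at the start of the pattern. A small Python sketch, assuming the goal is to strip the footer from a post (the names post and footer are hypothetical):

```python
import re

post = "two<div\nclass=\"blogger-post-footer\"><img\nalt='' /></div> three"

# (?s) at the start of the pattern enables DOTALL for the whole pattern.
footer = re.compile(r"(?s)<div.*?</div>")

# re.S is the short alias for re.DOTALL; both spellings find the same text.
assert footer.findall(post) == re.findall(r"<div.*?</div>", post, re.S)

# Remove the matched element and keep the rest of the post.
stripped = footer.sub("", post)
print(stripped)  # two three
```

Note that the lazy .*? stops at the first </div> it meets, so a <div> nested inside another <div> would be cut short. For input like the sample, with no nesting, that is not a problem.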
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Above message for: RegEx match open tags except XHTML self-contained tags
<(?:([a-zA-Z\?][\w:\-]*)(\s(?:\s*[a-zA-Z][\w:\-]*(?:\s*=(?:\s*"(?:\\"|[^"])*"|\s*'(?:\\'|[^'])*'|[^\s>]+))?)*)?(\s*[\/\?]?)|\/([a-zA-Z][\w:\-]*)\s*|!--((?:[^\-]|-(?!->))*)--|!\[CDATA\[((?:[^\]]|\](?!\]>))*)\]\])>
This is a template for extracting any HTML tag, taken from http://gskinner.com/RegExr/
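As a rough sketch of how that template could be used from Python: the pattern has six capture groups, which I read as (1) opening tag name, (2) attribute text, (3) a trailing / or ?, (4) closing tag name, (5) comment body and (6) CDATA body. That reading, and the helper name tokens, are my own assumptions, not something the source states.

```python
import re

# The template above, copied verbatim into a raw string.
TAG = re.compile(
    r"""<(?:([a-zA-Z\?][\w:\-]*)(\s(?:\s*[a-zA-Z][\w:\-]*(?:\s*=(?:\s*"(?:\\"|[^"])*"|\s*'(?:\\'|[^'])*'|[^\s>]+))?)*)?(\s*[\/\?]?)|\/([a-zA-Z][\w:\-]*)\s*|!--((?:[^\-]|-(?!->))*)--|!\[CDATA\[((?:[^\]]|\](?!\]>))*)\]\])>"""
)

def tokens(html):
    """Yield (kind, value) pairs for each tag-like construct found (hypothetical helper)."""
    for m in TAG.finditer(html):
        open_name, attrs, tail, close_name, comment, cdata = m.groups()
        if open_name is not None:
            # Group 3 holds optional whitespace plus a trailing "/" or "?".
            kind = "self-closing" if tail.strip() == "/" else "open"
            yield kind, open_name
        elif close_name is not None:
            yield "close", close_name
        elif comment is not None:
            yield "comment", comment
        else:
            yield "cdata", cdata

sample = '<div class="a"><img alt="" /><!-- note --></div>'
print(list(tokens(sample)))
```

This only tokenizes tags. It does not pair an opening <div> with its closing </div>, so the matching still has to be done separately.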