How can I delete a string that begins with <h1> and ends with </h1>?

Question

I want to delete <h1>xxx yyyy zzz </h1> with PHP. But first, I want to control (check) whether the string starts with <h1> and ends with </h1>.

Is there a function for this purpose?

if(string begins with '<h1>' and ends with '</h1>'){

    replace '<h1>xxx yyyy zzz </h1>' with NULL or an empty string

}

I'm sorry.. this isn't completely clear. What do you mean by `control`? — christopher, Feb 13 '14 at 21:00
`preg_replace()` will do the trick :) Regular expressions rules! — cyadvert, Feb 13 '14 at 21:01
"Regular expressions rules!"? http://stackoverflow.com/a/1732454/1447657 — Jonathan, Feb 13 '14 at 21:03

score 4 · Accepted Answer · answered Feb 13 '14 at 21:01

What about just using a regular expression?

$string = preg_replace( "/<h1>(.*?)<\\/h1>/", "", $string );

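The *? makes the quantifier non-greedy: it matches as little as possible, so each match ends at the first </h1> that follows instead of at the last one in the string.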
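To make the difference concrete, here is a small sketch that runs a greedy and a non-greedy pattern on the same input. The sample string and variable names are invented for illustration, and the ~ delimiter is used only so the slash in </h1> needs no escaping.

```php
<?php
// Hypothetical input: two headings with ordinary text between them.
$string = '<h1>first</h1> keep me <h1>second</h1>';

// Greedy: .* runs to the LAST </h1>, so the text in between is removed too.
$greedy = preg_replace('~<h1>.*</h1>~', '', $string);

// Non-greedy: .*? stops at the FIRST </h1>, so each heading is removed on its own.
$lazy = preg_replace('~<h1>.*?</h1>~', '', $string);

var_dump($greedy); // string(0) ""
var_dump($lazy);   // string(9) " keep me "
```

Neither pattern matches a heading that spans several lines, because . does not match a newline by default; adding the s modifier (for example '~<h1>.*?</h1>~s') changes that.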

redolent

Is that a typo? should be `<\\/h1>` – mehmetseckin Feb 13 '14 at 21:03
Oops. Thanks for catching that. – redolent Feb 13 '14 at 21:03
This will also remove `
skjgkdjjfkkjdfjdf
jdfjdfdfjdf
jfjdfjfd
`. – Amal Murali Feb 13 '14 at 21:04
@amal-murali, Having the `*?` will prevent that issue – redolent Feb 13 '14 at 21:07
Regex is not a tool that can be used to correctly parse HTML. regex-infection will devour your HTML parser,ichor permeates all MY FACE MY FACE ᵒh god no NO NOO̼OO NΘ stop the an*̶͑̾̾̅ͫ͏̙̤g͇̫͛͆̾ͫ̑͆l͖͉̗̩̳̟̍ͫͥͨe̠̅s ͎a̧͈͖r̽̾̈́͒͑e not rè̑ͧ̌aͨl̘̝̙̃ͤ͂̾̆ ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ See http://stackoverflow.com/a/1732454/325521 – Shiva Feb 13 '14 at 21:09
Why do people try to manipulate HTML with Regex? It's insane. Use a parser such as [Simple HTML DOM](http://simplehtmldom.sourceforge.net/) if you are using PHP – CommandZ Feb 17 '14 at 03:46
For small tasks, regex is more lightweight and possibly easier to read. – redolent Feb 17 '14 at 19:30
@Amal this won't remove because the `*?` operator is non-greedy – redolent Jul 29 '16 at 18:03

score 2 · Answer 2 · answered Feb 13 '14 at 21:08

Regular expressions are not the right tool for this job. Use a DOM parser to parse HTML. Here's a solution using the built-in DOMDocument class.

$dom = new DOMDocument;
$your_html_string = '<h1>xxx yyyy zzz </h1>';
$dom->loadHTML($your_html_string);

$h1_tags = $dom->getElementsByTagName('h1');

// getElementsByTagName() returns a live list, so copy the elements
// into an array before removing any of them
$remove = array();
foreach ($h1_tags as $tag) {
    $remove[] = $tag;
}

// remove them by iterating over the copy, not the live list
foreach ($remove as $tag) {
    $tag->parentNode->removeChild($tag);
}

// remove the DOCTYPE/html/body tags that DOM adds by default
$html = preg_replace(
    '~<(?:!DOCTYPE|/?(?:html|head|body))[^>]*>\s*~i', '', $dom->saveHTML()
);

echo $html;
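As a variation on the code above, the following sketch wraps the same idea in a function and serializes only the children of <body>, which avoids the final regex clean-up. The function name removeH1 and the sample input are assumptions made for this example, and the scalar type declarations assume PHP 7 or later.

```php
<?php
// removeH1 is a hypothetical helper name used only in this sketch.
function removeH1(string $html): string
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list to an array before removing anything.
    foreach (iterator_to_array($dom->getElementsByTagName('h1')) as $tag) {
        $tag->parentNode->removeChild($tag);
    }

    // Serialize only what is inside <body>, so the DOCTYPE/html/body
    // wrapper that loadHTML() adds never reaches the output.
    $body = $dom->getElementsByTagName('body')->item(0);
    $out = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $dom->saveHTML($child);
        }
    }
    return $out;
}

// The heading is dropped and the paragraph is kept.
echo removeH1('<h1>xxx yyyy zzz </h1><p>kept</p>');
```

Note that loadHTML() rejects an empty string, so a caller would need to handle that case separately.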

score 0 · Answer 3 · Omar Himada · 2014-02-13T21:29:16.170

<?php
    $string = '<h1>AbcXycKasOkasdMpal</h1>';
    $pattern = '/<h1>.*<\/h1>/i';
    $replacement = '';
    echo preg_replace($pattern, $replacement, $string);
?>

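Using a regular expression with PHP's preg_replace function, you can match every substring that starts with <h1> and ends with </h1> and replace it with an empty string. Note that .* is greedy: if the string contains several headings on one line, the match runs from the first <h1> to the last </h1>.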

If you only want to test whether the string matches, without replacing anything, look into preg_match.
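As a rough sketch of that check, the function below uses preg_match with anchors to test whether the whole string starts with <h1> and ends with </h1>, and only then empties it. The name stripWholeH1 is invented for this example, and the type declarations assume PHP 7 or later.

```php
<?php
// stripWholeH1 is a hypothetical helper name, not a built-in PHP function.
// It returns '' when the string starts with <h1> and ends with </h1>,
// and returns the string unchanged otherwise.
function stripWholeH1(string $string): string
{
    // \A and \z anchor the pattern to the very start and end of the string;
    // the s modifier lets . match newlines inside the heading.
    if (preg_match('~\A<h1>.*</h1>\z~s', $string) === 1) {
        return '';
    }
    return $string;
}

var_dump(stripWholeH1('<h1>xxx yyyy zzz </h1>')); // string(0) ""
var_dump(stripWholeH1('text <h1>x</h1>') === 'text <h1>x</h1>'); // bool(true)
```

Because the test is anchored at both ends, a string with leading or trailing whitespace around the heading is left untouched; trimming the input first would relax that.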

Edit: changed to fix what @SyntaxLAMP pointed out.

edited Feb 13 '14 at 21:29

answered Feb 13 '14 at 21:06

<\/h1> you mean. Gotta escape – SyntaxLAMP Feb 13 '14 at 21:27
