15

I am using PHP to output some rich text. How can I strip out the inline styles completely?

The text will be pasted straight out of MS Word or OpenOffice into a <textarea> which uses TinyMCE, a rich-text editor that allows you to add basic HTML formatting to the text. However, I want to remove the inline styles on the <p> tags (see below), but preserve the <p> tags themselves.

<p style="margin-bottom: 0cm;">A patrol of Zograth apes came round the corner, causing Rosette to pull Rufus into a small alcove, where she pressed her body against his. &ldquo;Sorry.&rdquo; She said, breathing warm air onto the shy man's neck. Rufus trembled.</p>
<p style="margin-bottom: 0cm;">&nbsp;</p>
<p style="margin-bottom: 0cm;">Rosette checked the coast was clear and pulled Rufus out of their hidey hole. They watched as the Zograth walked down a corridor, almost out of sight and then collapsed next to a phallic fountain. As their bodies hit the ground, their guns clattered across the floor. Rosette stopped one with her heel and picked it up immediately, tossing the other one to Rufus. &ldquo;Most of these apes seem to be dying, but you might need this, just to give them a helping hand.&rdquo;</p>
Onion

10 Answers

30

I quickly put this together, but for 'inline styles' (!) you will need something like

$text = preg_replace('#(<[a-z ]*)(style=("|\')(.*?)("|\'))([a-z ]*>)#', '\\1\\6', $text);
Jake N
20

Here is a preg_replace solution I derived from Crozin's answer. This one allows for attributes before and after the style attribute fixing the issue with anchor tags.

$value = preg_replace('/(<[^>]*) style=("[^"]+"|\'[^\']+\')([^>]*>)/i', '$1$3', $value);
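
As a quick check of the pattern above, here is a minimal sketch run against an anchor that has attributes on both sides of style. The sample string is made up for illustration; only the preg_replace call comes from this answer.

```php
<?php
// Made-up sample: the anchor carries attributes before and after style.
$value = '<p style="margin-bottom: 0cm;">See <a href="/map" style="color: red;" title="Map">the map</a>.</p>';

// Group 1 is everything before " style=", group 3 is the rest of the tag.
$value = preg_replace('/(<[^>]*) style=("[^"]+"|\'[^\']+\')([^>]*>)/i', '$1$3', $value);

echo $value, PHP_EOL;  // <p>See <a href="/map" title="Map">the map</a>.</p>
```

With this input, href and title survive and only style is dropped. Note that the pattern expects a single space before style and a non-empty quoted value, so style="" or an unquoted value would be left in place.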
Collin James
  • 1
    Great response. The accepted solution is also OK, but it deletes too much in some tags such as a (it removes attributes like href). This solution is better. – felipep Mar 14 '14 at 09:40
  • This solution is best because it does not only affect one-letter tags (p, a, etc.); it also affects the others (div, span, etc.). – Diego Somar Mar 04 '17 at 16:14
7

Use HtmlPurifier

troelskn
3

You could use regular expressions:

$text = preg_replace('#<(.+?)style=(?:"|\')?[^"\']+(?:"|\')?(.*?)>#si', '<\\1 \\2>', $text);
Crozin
  • 3
    see this http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Alon Gubkin Mar 21 '10 at 22:27
  • Thanks, but that line doesn't work. I get the error: Parse error: syntax error, unexpected '[' in ... (etc filename) – Onion Mar 22 '10 at 22:19
  • I've forgotten to add escape chars before `'` ;) – Crozin Mar 22 '10 at 22:35
  • Hi Crozin, not sure where I should add an escape character? Do you mean a \ ? – Onion Mar 22 '10 at 22:54
  • @Alon, see the second answer on that page: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1733489#1733489 . He has some known HTML which is being reliably generated, so a regex is not a bad solution here. – nickf Mar 22 '10 at 23:31
3

You can use: $content = preg_replace('/style=[^>]*/', '', $content);
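
One caveat worth checking: [^>]* consumes everything from style= up to the closing >, so any attribute that follows style is removed as well. The sketch below shows this on a made-up anchor tag and compares it with a narrower variant; the variant and the variable names $broad and $narrow are illustrative assumptions, not part of this answer.

```php
<?php
// Made-up sample where style is not the last attribute.
$content = '<a style="color: red;" href="/map">map</a>';

// Pattern from the answer: consumes everything from style= up to the closing >.
$broad = preg_replace('/style=[^>]*/', '', $content);

// Hypothetical narrower variant: consumes only the quoted style value.
$narrow = preg_replace('/\sstyle=("[^"]*"|\'[^\']*\')/i', '', $content);

echo $broad, PHP_EOL;   // <a >map</a>
echo $narrow, PHP_EOL;  // <a href="/map">map</a>
```

For the question's input, where style is the only attribute on each <p>, both give usable markup. The narrower variant is still a plain text match, so it would also hit style="..." appearing in ordinary text outside a tag.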

3

You can also use PHP Simple HTML DOM Parser, as follows:

$html = str_get_html(SOME_HTML_STRING);

foreach ($html->find('*[style]') as $item) {
   $item->style = null;
}
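
If adding a third-party parser is not an option, roughly the same idea can be sketched with PHP's built-in DOM extension. This swaps in DOMDocument and DOMXPath for Simple HTML DOM Parser; the sample input and the variable names are assumptions for illustration.

```php
<?php
// Made-up sample input, kept ASCII-only to sidestep encoding questions.
$input = '<p style="margin-bottom: 0cm;">Rufus trembled.</p><p style="margin-bottom: 0cm;">Sorry.</p>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($input);             // libxml wraps the fragment in <html><body>

// Remove the style attribute from every element that carries one.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//*[@style]') as $node) {
    $node->removeAttribute('style');
}

// Serialize only the children of <body>, i.e. the original fragment.
$clean = '';
foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
    $clean .= $doc->saveHTML($child);
}

echo $clean, PHP_EOL;
```

Because the markup is parsed rather than pattern-matched, attribute order and quoting style should not matter. Real pasted text containing non-ASCII characters would likely need an explicit encoding hint before loadHTML, which this sketch leaves out.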
Alon Gouldman
0

Couldn't you just use strip_tags and leave in the tags you want, e.g. <p>, <strong>, etc.?
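
For reference, a small sketch of what strip_tags does with an allowed-tags list. The sample string is made up; the point is that allowed tags are kept together with their attributes, so the inline style stays.

```php
<?php
// Made-up sample: an allowed <p> with an inline style and a disallowed <span>.
$text = '<p style="margin-bottom: 0cm;">Hi <span style="color: red;">there</span></p>';

// The second argument lists the tags to keep.
$kept = strip_tags($text, '<p><strong>');

echo $kept, PHP_EOL;  // <p style="margin-bottom: 0cm;">Hi there</p>
```

The <span> is removed, but the style attribute on <p> is untouched, so strip_tags on its own does not address the question.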

niggles
  • No, because I want to keep the <p> tags, but I don't want any with inline styles, e.g. <p style="...">. It's the inline style I want to remove, without removing the <p> tag itself. – Onion Mar 22 '10 at 22:10
0

Why don't you just overwrite the tags? Replace each opening tag with a bare one, and you will have clean tags without inline styling.
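
One possible reading of this suggestion, sketched for <p> tags only: replace every opening <p ...> with a bare <p>. The pattern and the sample string are assumptions for illustration, not code from this answer.

```php
<?php
// Made-up sample based on the question's markup, plus a <pre> as a control.
$text = '<p style="margin-bottom: 0cm;">Rufus trembled.</p><pre class="x">kept</pre>';

// \b stops the pattern from also matching longer tag names such as <pre>.
$text = preg_replace('/<p\b[^>]*>/i', '<p>', $text);

echo $text, PHP_EOL;  // <p>Rufus trembled.</p><pre class="x">kept</pre>
```

This drops every attribute on <p>, not just style, which is fine for the pasted Word markup in the question but would also remove things like class or id.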

Sinan
0

I found this class very useful for stripping attributes (especially where there's crazy MS Word formatting all through the text):

http://semlabs.co.uk/journal/php-strip-attributes-class-for-xml-and-html

niggles
0

I needed to clear the style attribute from img tags and solved it with this code:

$text = preg_replace('#(<img (.*) style=("|\')(.*?)("|\'))([a-z ]*)#', '<img \\2\\6', $text);
echo  $text;
Aladaghlou