Using PHP, I want to remove all HTML attributes except:
the "src" attribute on "img" tags
and
the "href" attribute on "a" tags.
My input is an .html file that has been converted from .doc and .docx files.
My output should again be an HTML file, with all other attributes removed.
Kindly help me, please.
Edit:
After trying Alexander's script as shown below, I don't see any changes when I open strip.html in a code editor:
<?php
$path = '/var/www/strip.html';
$html = file_get_contents($path);

$dom = new DOMDocument();
$dom->strictErrorChecking = false;
$dom->formatOutput = true;
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// Strip every attribute except "src" from <img> elements.
if (false === ($elements = $xpath->query("//img"))) die('Error');
foreach ($elements as $element) {
    // Walk backwards so that removals do not shift the remaining indexes.
    for ($i = $element->attributes->length; --$i >= 0;) {
        $name = $element->attributes->item($i)->name;
        if ('src' !== $name) {
            $element->removeAttribute($name);
        }
    }
}

// Strip every attribute except "href" from <a> elements.
if (false === ($elements = $xpath->query("//a"))) die('Error');
foreach ($elements as $element) {
    for ($i = $element->attributes->length; --$i >= 0;) {
        $name = $element->attributes->item($i)->name;
        if ('href' !== $name) {
            $element->removeAttribute($name);
        }
    }
}

// Write the modified document back to the same file.
$dom->saveHTMLFile($path);
?>
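
To check whether the attribute-stripping logic itself works, independently of the file being written back, the same idea can be run on an inline string. This is only a minimal sketch: `stripAttributes()`, the `$allowed` map, and the sample markup are hypothetical names made up for illustration, not part of Alexander's script. Unlike the script above, it visits every element that has attributes, not only "img" and "a".

```php
<?php
// Hypothetical helper (not from the original script): keeps only the
// whitelisted attributes for each tag name and returns the HTML as a string.
function stripAttributes(string $html, array $allowed): string
{
    // Collect parser warnings internally; Word exports are rarely clean HTML.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Select every element that carries at least one attribute.
    foreach ($xpath->query('//*[@*]') as $element) {
        $keep = $allowed[$element->nodeName] ?? [];
        // Walk backwards so that removals do not shift the remaining indexes.
        for ($i = $element->attributes->length - 1; $i >= 0; $i--) {
            $attr = $element->attributes->item($i);
            if (!in_array($attr->nodeName, $keep, true)) {
                $element->removeAttributeNode($attr);
            }
        }
    }
    return $dom->saveHTML();
}

// Assumed sample input, only for demonstration.
$sample = '<p class="MsoNormal"><a href="x.html" style="color:red">link</a>'
        . '<img src="a.png" width="10"></p>';
echo stripAttributes($sample, ['img' => ['src'], 'a' => ['href']]);
```

If the echoed markup comes out clean here while strip.html still looks unchanged, that would suggest the problem lies in reading or writing the file (for example the path or its permissions) and not in the DOM loop, though this is only a guess from the symptoms described.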