I would like to make use of Simple HTML DOM parser to search for mail-adresses in the content of a html-site and replace them.
The replacement contains a span
element and a little bit JS (this should obfuscate the addresses.
At the moment this works as follows:
$pattern =
"/(?:[a-z0-9!#$%&'*+=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+=?^_`{|}~-]+)*|\"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])/";
preg_match_all( $pattern, $content, $matches );
foreach ( $matches[ 0 ] as $email ) {
$content = $this->searchDOM(
$content,
$email,
$this->hide_email($email)
);
}
This is the searchDOM
-method:
private function searchDOM( $content, $search, $replace, $excludedParents = [] )
{
$dom = HtmlDomParser::str_get_html(
$content,
true,
true,
DEFAULT_TARGET_CHARSET,
false,
DEFAULT_BR_TEXT,
DEFAULT_SPAN_TEXT
);
foreach ( $dom->find( 'text' ) as $element ) {
if ( !in_array( $element->parent()->tag, $excludedParents ) ) {
$element->innertext = preg_replace(
'/(?<!\w)' . preg_quote( $search, "/" ) . '(?!\w)/i',
$replace,
$element->innertext
);
}
}
return $dom->save();
}
and this is the hide_email-method:
function hide_email( $email )
{
$character_set = '+-.0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz';
$key = str_shuffle( $character_set );
$cipher_text = '';
$id = 'e' . rand( 1, 999999999 );
for ( $i = 0; $i < strlen( $email ); $i += 1 )
$cipher_text .= $key[ strpos( $character_set, $email[ $i ] ) ];
$script = 'var a="' . $key . '";var b=a.split("").sort().join("");var c="' . $cipher_text . '";var d="";';
$script .= 'for(var e=0;e<c.length;e++)d+=b.charAt(a.indexOf(c.charAt(e)));';
$script .= 'document.getElementById("' . $id . '").innerHTML="<a href=\\"mailto:"+d+"\\">"+d+"</a>"';
$script = "eval(\"" . str_replace( [ "\\", '"' ], [ "\\\\", '\"' ], $script ) . "\")";
$script = '<script type="text/javascript">/*<![CDATA[*/' . $script . '/*]]>*/</script>';
return '<span id="' . $id . '">[javascript protected email address]</span>' . $script;
}
Well - this is not working as expected, because the rendered page shows only "[javascript protected email address]". If I have a look at the source, the a
-tag is missing.