When I use this code:

add_filter('the_content', 'my_nofollow');
add_filter('the_excerpt', 'my_nofollow');

function my_nofollow($content) {
    return preg_replace_callback('/<a[^>]+/', 'my_nofollow_callback', $content);
}
function my_nofollow_callback($matches) {
    $link = $matches[0];
    $site_link = get_bloginfo('url');
    if (strpos($link, 'rel') === false) {
        $link = preg_replace("%(href=\S(?!$site_link))%i", 'rel="nofollow" $1', $link);
    } elseif (preg_match("%href=\S(?!$site_link)%i", $link)) {
        $link = preg_replace('/rel=\S(?!nofollow)\S*/i', 'rel="nofollow"', $link);
    }
    return $link;
}

Only the links in the_content and the_excerpt get a nofollow attribute. How can I edit the code so that the whole WordPress site uses these functions (footer, sidebar, ...)?

Thank you

Klaus
  • You're on the right track but you [shouldn't parse html with regex](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). You could do this with [DomDocument](https://www.php.net/manual/en/class.domdocument.php) – TheGentleman Jul 21 '20 at 15:04
  • Does this answer your question? [How to add rel="nofollow" to links with preg\_replace()](https://stackoverflow.com/questions/5037592/how-to-add-rel-nofollow-to-links-with-preg-replace) – Toto May 22 '21 at 09:12
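
For reference, the DOMDocument approach suggested in the comment above could look roughly like the sketch below. It is a minimal illustration, not code from this thread: the function name `demo_nofollow_dom`, the wrapper `<div>`, and the simple prefix check against `$siteUrl` are all assumptions, and it expects an HTML fragment rather than a full page.

```php
<?php
// Hypothetical helper (not from the original posts): adds rel="nofollow"
// to external links using DOMDocument instead of regular expressions.
function demo_nofollow_dom(string $html, string $siteUrl): string
{
    if (trim($html) === '') {
        return $html;
    }

    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    // The XML prolog is a common workaround to make libxml read the input as
    // UTF-8; the wrapper div gives us one node to serialize back from.
    $doc->loadHTML('<?xml encoding="UTF-8"?><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        // Assumption: only absolute http(s) URLs that do not start with
        // $siteUrl count as external. Relative URLs are left alone.
        if (!preg_match('#^https?://#i', $href) || stripos($href, $siteUrl) === 0) {
            continue;
        }
        // Work on rel tokens so existing values (noopener, ...) are kept.
        $tokens = preg_split('/\s+/', strtolower($a->getAttribute('rel')), -1, PREG_SPLIT_NO_EMPTY);
        if (!in_array('nofollow', $tokens, true)) {
            $tokens[] = 'nofollow';
        }
        $a->setAttribute('rel', implode(' ', $tokens));
    }

    // Serialize only the children of the wrapper div (the first div in the document).
    $out = '';
    foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

A prefix check like this treats `https://example.com.evil.org` as internal when `$siteUrl` is `https://example.com`, so comparing hosts would be safer in real use.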

2 Answers

Here is a script that adds nofollow automatically to links outside $baseUrl and keeps the other attributes:

function nofollow(string $html, ?string $baseUrl = null) {
    return preg_replace_callback(
            // \b keeps the pattern from matching tags such as <abbr> or <article>
            '#<a\b([^>]*)>(.+)</a>#isU', function ($match) use ($baseUrl) {
                list ($a, $attr, $text) = $match;
                if (preg_match('#href=["\']([^"\']*)["\']#i', $attr, $url)) {
                    $url = $url[1];
                    // Only links that do not start with $baseUrl are changed
                    if (is_null($baseUrl) || !str_starts_with($url, $baseUrl)) {
                        $rel = '';
                        $relAttr = null;
                        if (preg_match('#rel=["\']([^"\']*)["\']#i', $attr, $m)) {
                            $relAttr = $m[0];
                            $rel = $m[1];
                        }
                        // stripos() returns 0 for a match at position 0, so compare with false
                        if (stripos($rel, 'nofollow') === false) {
                            $rel = trim($rel . ' nofollow');
                        }
                        $rel = 'rel="' . $rel . '"';
                        $attr = $relAttr !== null ? str_replace($relAttr, $rel, $attr) : $attr . ' ' . $rel;
                        // $attr already starts with whitespace, so no extra space after "<a"
                        $a = '<a' . $attr . '>' . $text . '</a>';
                    }
                }
                return $a;
            },
            $html
    );
}
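
One detail that is easy to get wrong in a script like this is merging nofollow into an existing rel value: a plain truthiness test on `strpos()` fails when `nofollow` is the first token, because the match position is 0. A token-based merge avoids that. The helper name `demo_merge_nofollow` below is hypothetical and only illustrates the idea.

```php
<?php
// Hypothetical helper for illustration: merge "nofollow" into a rel value
// without duplicating it and without dropping the other tokens.
function demo_merge_nofollow(string $rel): string
{
    // Split the attribute value into whitespace-separated tokens.
    $tokens = preg_split('/\s+/', trim($rel), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        if (strcasecmp($token, 'nofollow') === 0) {
            return implode(' ', $tokens); // already present, keep as is
        }
    }
    $tokens[] = 'nofollow';
    return implode(' ', $tokens);
}
```

Called with `'noopener'` this should give `'noopener nofollow'`, and a value that already contains `nofollow` comes back with its tokens unchanged.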
Redouane

The idea is to hook into the full HTML output buffer, once everything is loaded and filtered, not only the 'content' area. Be very careful with this! This approach is raw and low-level, and you can really mess things up with these buffer actions. But it's also very nice to know ;-) It works like this:

add_action( 'wp_loaded', 'buffer_start' );  function buffer_start() { ob_start( "textdomain_nofollow" ); }
add_action( 'shutdown', 'buffer_end' );     function buffer_end()   { ob_end_flush(); }
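
To see the buffering mechanism in isolation, here is a small sketch in plain PHP with no WordPress involved. The function name `demo_render_filtered` and its two callable parameters are assumptions made for this illustration: everything echoed by `$render` is collected by `ob_start()` and passed through `$filter` when the buffer is flushed.

```php
<?php
// Hypothetical demo: capture everything $render prints and run it
// through $filter, the same way an ob_start() callback would.
function demo_render_filtered(callable $render, callable $filter): string
{
    ob_start(); // outer buffer: collects the filtered result
    ob_start(function (string $buffer) use ($filter) {
        // PHP calls this with the buffered output when the buffer is flushed.
        return $filter($buffer);
    });
    $render();      // anything echoed here lands in the inner buffer
    ob_end_flush(); // flush the inner buffer through the callback
    return (string) ob_get_clean();
}

// Usage: a stand-in "page" and a trivial filter.
$page = demo_render_filtered(
    function () { echo '<a href="https://other.example/">x</a>'; },
    function (string $b): string { return str_replace('<a ', '<a rel="nofollow" ', $b); }
);
```

In the WordPress version above, the hooks play the role of this wrapper: `wp_loaded` opens the buffer and `shutdown` flushes it.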

Then do the replacement and set up the callback. Just imagine the $buffer variable as the full HTML of your page, as it will be sent to the browser.

function textdomain_nofollow( $buffer ) {
    return preg_replace_callback( '/<a[^>]+/', 'textdomain_nofollow_callback', $buffer );
}

Here is how I do the href checking (read my comments):

function textdomain_nofollow_callback( $matches ) {
    $link = $matches[0];

    // To also exclude specific external domains, use this pattern instead:
    //$exclude = '('. preg_quote( home_url(), '#' ) .'|(http|https)://([^.]+\.)?(domain-to-exclude\.org|other-domain-to-exclude\.com)|/)';

    // By default, exclude only your own domain and relative URLs that start with a '/'.
    // preg_quote() escapes the dots in the URL so they match literally.
    $exclude = '('. preg_quote( home_url(), '#' ) .'|/)';
    if ( preg_match( '#href=\S('. $exclude .')#i', $link ) )
        return $link;

    if ( strpos( $link, 'rel=' ) === false ) {
        // No rel attribute yet: insert one right after "<a ".
        $link = preg_replace( '/(?<=<a\s)/', 'rel="nofollow" ', $link );
    } elseif ( preg_match( '#rel=\S(?!nofollow)#i', $link ) ) {
        // rel exists but does not start with nofollow: prepend it to the value.
        $link = preg_replace( '#(?<=rel=.)#', 'nofollow ', $link );
    }

    return $link;
}

It works well in my experience...
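
As a possible alternative to matching the href with a pattern, the internal/external decision can also be made by comparing hosts with `parse_url()`. This is only a sketch under stated assumptions: `demo_is_external` is a hypothetical name, `$homeUrl` stands in for what `home_url()` would return, and links without a host (relative paths, fragments, `mailto:`) are treated as internal.

```php
<?php
// Hypothetical helper: decide whether an href points to another host.
function demo_is_external(string $href, string $homeUrl): bool
{
    $host = parse_url($href, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        // No host: relative URL, fragment, mailto:, or unparsable value.
        return false;
    }
    $homeHost = parse_url($homeUrl, PHP_URL_HOST);
    // Host names are case-insensitive, so compare without regard to case.
    return strcasecmp($host, (string) $homeHost) !== 0;
}
```

Unlike a check for a leading '/', this also classifies protocol-relative URLs such as `//cdn.example.net/a.js` by their host.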

Dharman