I'm a complete beginner at building web pages, and I'm working on a website that needs to change link colours according to the destination page. The links will be sorted into classes (e.g. good, bad, neutral) based on criteria the user supplies: e.g. links to content the user would find interesting are coloured blue, links to content the user (presumably) doesn't want to see are coloured like normal text, etc.

I reckon I need a way to parse the page for links to the content (stored in a MySQL database) and change the colours of all the links on the page (so I also need to be able to change the link classes in the HTML) before outputting the adapted page to the user. I read that regex is not a good way to find those links, so should I use a library, and if so, is html5lib good for what I'm doing?

Luinithil

1 Answer

There's no need to complicate things with PHP HTML parsers, which will mangle and forcefully "repair" your input HTML.

Here's how you can combine PHP with JavaScript; this is a complete, working and tested solution:

<?php
$arrBadLinks=array(
    "http://localhost/something.png",
    "https://www.apple.com/something.png",
);
$arrNeutralLinks=array(
    "http://www.microsoft.com/index.aspx",
    "ftp://samewebsiteasyours.com",
    "ftp://samewebsiteasyours.net/file.txt",
);
?>
<html>
    <head>
        <script>
        function colorizeLinks()
        {
            var arrBadLinks=<?php echo json_encode($arrBadLinks);?>;
            var arrNeutralLinks=<?php echo json_encode($arrNeutralLinks);?>;

            // Only anchors need checking, so ask the browser for <a> elements directly.
            var nodeList=document.getElementsByTagName("a");
            for(var n=nodeList.length-1; n>=0; n--)
            {
                var el=nodeList[n];

                // el.href is the absolute URL as resolved by the browser.
                if(arrBadLinks.indexOf(el.href)>-1)
                    el.style.color="red";
                else if(arrNeutralLinks.indexOf(el.href)>-1)
                    el.style.color="green";
                else
                    el.style.color="blue";
            }
        }

        if(window.addEventListener)
            window.addEventListener("load", colorizeLinks, false);
        else if (window.attachEvent)
            window.attachEvent("onload", colorizeLinks);
        </script>
    </head>
    <body>
        <p>
            <a href="http://www.microsoft.com/index.aspx">Neutral www.microsoft.com/index.aspx</a>
        </p>
        <p>
            <a href="http://localhost/something.png">Bad http://localhost/something.png</a>
        </p>
    </body>
</html>

This does not work for relative URLs: the comparison is made against absolute URLs, so a relative URL will not match. Make them absolute, or update the code to fill in http://current-domain.xxx for any relative URL.
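
One way to handle relative URLs is to resolve everything against the page's address before comparing. This is a minimal sketch, assuming the standard `URL` constructor is available (modern browsers and Node.js); `toAbsolute` and `classifyLink` are hypothetical helper names, not part of the code above.

```javascript
// Hypothetical helper: resolve a possibly relative URL against a base URL.
function toAbsolute(url, base) {
    return new URL(url, base).href;
}

// Hypothetical helper: classify one href as "bad", "neutral" or "good".
// Both the href and every list entry are made absolute before comparing.
function classifyLink(href, badLinks, neutralLinks, base) {
    var abs = toAbsolute(href, base);
    var normalize = function (u) { return toAbsolute(u, base); };

    if (badLinks.map(normalize).indexOf(abs) > -1) return "bad";
    if (neutralLinks.map(normalize).indexOf(abs) > -1) return "neutral";
    return "good";
}
```

In a browser, `base` could be `document.baseURI`, so that list entries such as `/something.png` are resolved the same way the browser resolves the links on the page.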

oxygen
  • Note also that there will be 0 strain on the server. All processing is done client side (equal distribution amongst clients, what more can you ask for?). – oxygen Sep 07 '12 at 15:34
  • This doesn't answer the question, which requires parsing of existing HTML, and rewriting it out – ernie Sep 07 '12 at 15:38
  • @ernie The question asks for a solution to colorize links. The solution I provided uses the DOM already parsed by the browser, walks through it, and adds colors to anchors. And it is complete working code (would you like to run it with any HTML you want?). Parsing the existing HTML is done by the browser (are you happy something is parsing the HTML?). – oxygen Sep 07 '12 at 15:40
  • Fair enough . . . I feel it answers the question literally, but does not answer the spirit of the question, as it sounds as if he wants to parse a third party's webpage and then rewrite it, but maybe I'm assuming too much. He also mentions changing the class attributes, which your answer doesn't directly address (though it'd be easy enough to modify the DOM there too) – ernie Sep 07 '12 at 15:43
  • .className="..." :) Probably a PHP parser would allow it the same way. – oxygen Sep 07 '12 at 15:47
  • Yeah, my original down-vote is now an up-vote; pretty elegant solution for minimizing server load, though it does have some pretty tight constraints – ernie Sep 07 '12 at 15:51
  • @ernie The links to be parsed are all on one website, not 3rd party. @Tiberiu-Ionuț Stan Thank you for the solution; I was hoping to avoid using parsers if possible. I have some questions about the solution however (please bear with me, I've never worked with PHP, very little with Javascript and only mostly good with HTML and CSS): what's that `echo json_encode($arrBadLinks)` mean? – Luinithil Sep 07 '12 at 16:14
  • json_encode converts a PHP variable's value into the javascript code equivalent. Thus it is safe to use it to initialize a javascript variable (it will escape strings properly, etc). It is the right way to exchange data from PHP to Javascript. If you look at the generated code, instead of array("somelink", ...), you will see ["somelink", ...]. – oxygen Sep 07 '12 at 19:21
  • @Tiberiu-Ionuț Stan Thank you for that explanation. I tried saving your code snippet above, exactly as written, in a PHP file and opened it in my browser to see the code generated, but I find that when I open the saved file the link color is the same for both the good link and the bad link, when as I understand from your code above one should be red and the other should be blue. What am I doing wrong? – Luinithil Sep 08 '12 at 15:13
  • This file contains some optional PHP code which shows you how to pass PHP variables' values into JavaScript. The PHP code needs to be executed by PHP, then served by a webserver. Opening it directly with your browser will not execute the PHP part. You can easily eliminate the PHP part and code the link arrays directly in JavaScript. If you work on Windows install WAMP, if on Linux install LAMP, if on MacOSX MAMP (A=Apache webserver, M=MySQL, P=PHP). Execute the file in the /www folder of the *AMP installation using http://localhost/whateverfilenameyouchose.php – oxygen Sep 08 '12 at 17:08
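
The `.className` suggestion from the comments could be sketched roughly as below, so that colours live in CSS rather than in inline styles. This is only an illustration under stated assumptions: the class names (`link-bad`, `link-neutral`, `link-good`) and the function name `classifyAnchors` are hypothetical, and matching CSS rules such as `a.link-bad { color: red; }` would have to be defined in the page's stylesheet.

```javascript
// Hypothetical class names; pair them with CSS rules in the stylesheet.
var LINK_CLASSES = { bad: "link-bad", neutral: "link-neutral", good: "link-good" };

// Set className on each anchor according to the list its href appears in.
// `anchors` is any array-like of objects with href and className properties,
// e.g. document.getElementsByTagName("a") in a browser.
function classifyAnchors(anchors, badLinks, neutralLinks) {
    for (var n = 0; n < anchors.length; n++) {
        var el = anchors[n];

        if (badLinks.indexOf(el.href) > -1)
            el.className = LINK_CLASSES.bad;
        else if (neutralLinks.indexOf(el.href) > -1)
            el.className = LINK_CLASSES.neutral;
        else
            el.className = LINK_CLASSES.good;
    }
}
```

In the page it would be called from the load handler, e.g. `classifyAnchors(document.getElementsByTagName("a"), arrBadLinks, arrNeutralLinks)`. Note that assigning `className` replaces any classes the link already has; `el.classList.add(...)` is an alternative where existing classes must be kept.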