I'm trying to process some HTML and replace the src of every img tag with a base64 data URI. I've written the function below to convert an image and return it as base64. What I need help with is the following:

I need to use str_replace, preg_replace, or some sort of regex to scan some HTML and replace every "src" with the base64 representation of the image. The HTML is stored in a variable, not in an actual HTML document. For example, if I have some HTML like:

$htmlSample = "<div>Some text, yada yada, and now an image <img src='image1.png' /></div>";

I need to scan it and replace src='image1.png' with the base64 equivalent, something like src="data:image/png;base64,/9j/4WvuRXhpZgAASUkqAAgAAAAIAA8BAgASAAAAbgAABAgAK" (this is not actual base64, just some filler text). The function will need to be able to do this for multiple images in the HTML. If you can point me in the right direction I would be very grateful. Thanks guys!

function convertImage($file)
{
    if ($fp = fopen($file, "rb", 0))
    {
        $picture = fread($fp, filesize($file));
        fclose($fp);
        $base64 = base64_encode($picture);
        $tag = '<img src="data:image/png;base64,' . $base64 . '" />';
        return $tag;
    }
}
ajodom10
  • http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags#answer-1732454 Regular expressions are not the right tool. Just saying. – KingCrunch Mar 14 '13 at 00:03
  • [getElementsByTagName()](http://www.php.net/manual/en/domdocument.getelementsbytagname.php), [getAttribute()](http://www.php.net/manual/en/domelement.getattribute.php), [setAttribute()](http://www.php.net/manual/en/domelement.setattribute.php). – Wrikken Mar 14 '13 at 00:06
  • possible duplicate of [How to parse and process HTML/XML with PHP?](http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-xml-with-php) – Wrikken Mar 14 '13 at 00:08
  • Contrary to the dated meme, a regex is sufficient for simple string searches such as that. That task in particular has been covered numerous times, e.g. [Help with regex replace in php](http://stackoverflow.com/q/1175596) – mario Mar 14 '13 at 00:13
  • It's not an actual file; it's a string with some text, and I need to find some text and replace it after running a function on it. Does it really matter if the string contains HTML? – ajodom10 Mar 14 '13 at 00:37
  • I think this is going to help... http://stackoverflow.com/questions/1196570/using-regular-expressions-to-extract-the-first-image-source-from-html-codes I'm looking into it now and will post my results here. – ajodom10 Mar 14 '13 at 00:46
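
For anyone who wants to try the built-in DOM route that Wrikken's comment points at, here is a minimal sketch rather than a definitive implementation. The function names inlineImages() and toDataUri() are made up for illustration. It assumes PHP 7.1+ with the DOM extension enabled, that every src is a local file path, that the HTML is a fragment rather than a full document, and that the images are PNGs as in the question.

```php
<?php
// Hypothetical helper: returns a data: URI for a local file, or null if it cannot be read.
function toDataUri(string $file): ?string
{
    if (!is_file($file) || !is_readable($file)) {
        return null;
    }
    $data = file_get_contents($file);
    if ($data === false) {
        return null;
    }
    // Assumption: the image is a PNG, as in the question.
    return 'data:image/png;base64,' . base64_encode($data);
}

// Hypothetical helper: replaces the src of every readable <img> in an HTML fragment.
function inlineImages(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a full document; the meta tag tells libxml to read it as UTF-8.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>'
        . $html . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $uri = toDataUri($img->getAttribute('src'));
        if ($uri !== null) {
            $img->setAttribute('src', $uri);
        }
    }

    // Serialize only the children of <body>, so the wrapper is not part of the result.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling inlineImages($htmlSample) returns the rewritten fragment as a string. Images whose src cannot be read are left untouched, and the serializer normalizes the markup (for example, attribute values come back in double quotes).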

2 Answers

Look at a DOM manipulator such as SimpleDOM. It will let you parse HTML documents in a more object-oriented way instead of using messy regular expressions, and the library will more likely than not handle situations that you may not think of.

Adam
  • The html is stored as a variable and not as an actual html document. Will SimpleDOM still work with something like that? – ajodom10 Mar 14 '13 at 00:28
  • Yes, SimpleDOM has multiple methods for loading a page: from a variable, a URL, or a file. I just found the link: http://simplehtmldom.sourceforge.net. The docs are quite straightforward too. – Adam Mar 14 '13 at 01:14

As Adam suggested, I was able to get this done using SimpleDOM (link: simplehtmldom.sourceforge.net).

require_once('simple_html_dom.php');
$html = "This is some test code <img width='50' src='img/paddock1.jpg' /> And this is some additional text and an image: <img src='img/paddock2.jpg' />";

// Use str_get_html() from simple_html_dom.php to make the HTML string parsable
$doc = str_get_html($html);

// Find each image in the HTML and replace its src with the base64 version
foreach ($doc->find('img[src]') as $img)
{
    // Get the src of the image and assign it to $src
    $src = $img->src;

    $imageBase = convertImage($src);

    $img->src = $imageBase;
}

$html = (string) $doc;

echo $html;

function convertImage($file)
{
    // Open the file named by $src above; the body runs only if the file can be opened
    if ($fp = fopen($file, "rb", 0))
    {
        $picture = fread($fp, filesize($file));
        fclose($fp);

        // Convert the image file to base64
        $base64 = base64_encode($picture);

        // Return the necessary "data:" prefix plus the base64 code to $imageBase above, to be inserted into html > img > src
        return 'data:image/png;base64,' . $base64;
    }
}
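
One thing worth noting about the code above: the sample images are .jpg files, but the data URI is always labelled image/png, and convertImage() returns nothing when the file cannot be opened, so the src ends up empty. Below is a rough sketch of a variant that addresses both. The name convertImageWithMime() is hypothetical, and the sketch assumes PHP 7.1+ and local files in a format that getimagesizefromstring() recognises.

```php
<?php
// Hypothetical variant of convertImage(): returns a data: URI, or null on failure.
function convertImageWithMime(string $file): ?string
{
    if (!is_file($file) || !is_readable($file)) {
        return null;
    }

    $data = file_get_contents($file);
    if ($data === false) {
        return null;
    }

    // getimagesizefromstring() inspects the image header and reports its MIME type;
    // it returns false when the data is not a recognised image format.
    $info = @getimagesizefromstring($data);
    if ($info === false) {
        return null;
    }

    return 'data:' . $info['mime'] . ';base64,' . base64_encode($data);
}
```

In the loop, the caller would then assign to $img->src only when the result is not null, so that images that cannot be read keep their original src.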
ajodom10