What's a good way to remove data URIs from a block of text with a regular expression? My application has to strip them from email bodies because Amazon SES (Simple Email Service) rejects any email that contains data URIs. They should also be removed for security reasons.

Example data URI:

data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o/XBs/fNwfjZ0frl3/zy7////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAkAABAALAAAAAAQABAAAAVVICSOZGlCQAosJ6mu7fiyZeKqNKToQGDsM8hBADgUXoGAiqhSvp5QAnQKGIgUhwFUYLCVDFCrKUE1lBavAViFIDlTImbKC5Gm2hB0SlBCBMQiB0UjIQA7

It could be surrounded by quotes as in HTML:

<img src="data:image/gif;base64,R0lGODlhEAAQA...." />

or it could be in a CSS background:

background: url(data:image/gif;base64,R0lGODlhEAAQA....);
Michael Butler
    You'll probably find [this question](http://stackoverflow.com/questions/8106038/regex-for-detecting-base64-encoded-strings) (as well as the related question linked in a comment) helpful. And of course [preg_replace](http://us2.php.net/preg_replace) – Patrick Q Apr 21 '14 at 23:55

1 Answer

This should do the trick:

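// Match "data:", a media type, one token such as "base64", a comma, then the payload up to a quote, ")" or whitespace.
$regex = '~data:[^;]+;[A-Za-z0-9]+,[^")\'\s]+~';
// Replace every match in the email body with an empty string.
$body = preg_replace($regex, '', $body);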
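
To check the pattern against the HTML and CSS cases from the question, here is a minimal sketch. The function name strip_data_uris is hypothetical and only wraps the same preg_replace call, and the payloads are shortened placeholders rather than real image data.

```php
<?php
// Hypothetical helper: wraps the pattern from this answer in a function.
function strip_data_uris(string $body): string
{
    $regex = '~data:[^;]+;[A-Za-z0-9]+,[^")\'\s]+~';
    // preg_replace returns null on a regex error; fall back to the input then.
    return preg_replace($regex, '', $body) ?? $body;
}

// Shortened sample payloads, modelled on the examples in the question.
$html = '<img src="data:image/gif;base64,R0lGODlhEAAQA" />';
$css  = 'background: url(data:image/gif;base64,R0lGODlhEAAQA);';

echo strip_data_uris($html), "\n"; // <img src="" />
echo strip_data_uris($css), "\n";  // background: url();
```

Note that, as written, the pattern expects exactly one ";token" between the media type and the comma, so a URI such as data:text/plain,hello or one carrying an extra charset parameter would not be matched and may need a looser pattern.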
Michael Butler