I'm looping through some text with embedded literature references. Some of these are DOI numbers, and I need to linkify them.

Example text:

<div>Interesting article here:  doi:10.1203/00006450-199305000-00005</div>

What I've tried so far:

$html = preg_replace("\b(10[.][0-9]{4,}(?:[.][0-9]+)*/(?:(?![\"&\'<>])[[:graph:]])+)\b", "<a href='https://doi.org/\\0' target='_new'>doi:\\0</a>",$html);

This returns an empty string.

I'm expecting:

<div>Interesting article here:  <a href='https://doi.org/10.1203/00006450-199305000-00005' target='_new'>doi:10.1203/00006450-199305000-00005</a></div>

Where am I going wrong?

edit 2018-01-30: updated DOI resolver per Katrin's answer below.

a coder

3 Answers

CrossRef recommends a pattern that they tested successfully on 99.3% of DOIs:

/^10.\d{4,9}/[-._;()/:A-Z0-9]+$/i

Also, the new recommended resolver resides at https://doi.org/.
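
As written, the CrossRef pattern is anchored with ^ and $ and contains an unescaped slash, so it cannot be dropped into preg_replace() unchanged. The following is only a sketch of one possible adaptation for linkifying DOIs inside text, not CrossRef's own code: the anchors are removed, "~" is chosen as the delimiter, the dot after "10" is escaped, and an optional "doi:" prefix is consumed so it is not duplicated. The variable names and link markup are assumptions for illustration.

```php
<?php
// Example text from the question.
$html = '<div>Interesting article here:  doi:10.1203/00006450-199305000-00005</div>';

// CrossRef-style pattern adapted for searching inside text (assumption:
// "~" as delimiter so "/" needs no escaping; no ^ or $ anchors).
$pattern = '~(?:doi:)?(10\.\d{4,9}/[-._;()/:A-Z0-9]+)~i';

// $1 is the bare DOI without the optional "doi:" prefix.
$replacement = '<a href="https://doi.org/$1">doi:$1</a>';

$linked = preg_replace($pattern, $replacement, $html);
echo $linked, PHP_EOL;
```

One caveat of this sketch: because "." is part of the allowed character class, a DOI that ends a sentence would pull the trailing full stop into the link.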

Katrin Leinweber

I adapted the pattern recommended by CrossRef so that it also accepts an optional resolver URL prefix, and I use this function in my Laravel project:

function is_valid_doi($doi)
{
    // Dots are escaped so they match a literal "." rather than any character.
    return preg_match('/^((http(s)?:\/\/)?(dx\.)?doi\.org\/)?10\.\d{4,9}\/[-._;()\/:A-Z\d]+$/i', $doi);
}
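
To give a rough idea of what this kind of check accepts and rejects, here is a self-contained sketch. The function name is_valid_doi_sketch and the sample inputs are made up for illustration; the pattern is a variant of the one above with "~" as delimiter and the dots escaped, and it requires PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical stand-alone variant of the validator, for illustration only.
function is_valid_doi_sketch(string $doi): bool
{
    // Optional resolver prefix (scheme optional, "dx." optional), then the DOI.
    $pattern = '~^(?:(?:https?://)?(?:dx\.)?doi\.org/)?10\.\d{4,9}/[-._;()/:A-Z\d]+$~i';
    return preg_match($pattern, $doi) === 1;
}

$samples = [
    '10.1203/00006450-199305000-00005',                  // bare DOI
    'https://doi.org/10.1203/00006450-199305000-00005',  // resolver URL
    'doi:10.1203/00006450-199305000-00005',              // "doi:" prefix is not covered
    '10.12/abc',                                         // fewer than 4 digits after "10."
];

foreach ($samples as $sample) {
    echo $sample, ' => ', is_valid_doi_sketch($sample) ? 'valid' : 'invalid', PHP_EOL;
}
```

Because the pattern is anchored at both ends, this only validates a string that consists of nothing but the DOI; it does not find DOIs embedded in longer text.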

Hope this helps.

PouriaDiesel

Using a regular expression test tool, I found an expression that works for my example text:

$pattern        = '(10[.][0-9]{4,}[^\s"/<>]*/[^\s"<>]+)';
$replacement    = "<a href='http://dx.doi.org/$0' target='1'>doi:$0</a>";
$html = preg_replace($pattern, $replacement, $html);
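
A likely reason this works where the attempt in the question did not: PHP's preg_* functions expect the pattern to be wrapped in delimiters. Here the outer parentheses serve as the delimiters, whereas a pattern that begins with a backslash has no valid delimiter, so preg_replace() emits a warning and returns null. A minimal sketch of the difference, using a simplified pattern and a placeholder replacement chosen only for illustration:

```php
<?php
$html = '<div>Interesting article here:  doi:10.1203/00006450-199305000-00005</div>';

// No delimiters: the leading backslash is rejected as a delimiter, so
// preg_replace() warns (silenced here with @) and returns null.
$noDelimiters = '\b(10[.][0-9]{4,}[^\s"/<>]*/[^\s"<>]+)\b';
$failed = @preg_replace($noDelimiters, '[$0]', $html);
var_dump($failed === null);

// Same expression wrapped in "~" delimiters: the match is replaced as expected.
$withDelimiters = '~\b(10[.][0-9]{4,}[^\s"/<>]*/[^\s"<>]+)\b~';
$wrapped = preg_replace($withDelimiters, '[$0]', $html);
echo $wrapped, PHP_EOL;
```

Checking the return value for null before using it makes this kind of pattern error visible instead of silently producing empty output.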

hth

a coder