
I'm using a PDF-to-HTML plugin for WordPress, which generates a canvas and a text layer for each page. I need to make any URLs within the PDF clickable, so I'm trying to write a script that will detect any URLs (using .com as the identifier; I may add to this later).

All I need help with at this stage is capturing the full URL as a variable, not just the div that contains the ".com".

I currently have this script, but I need to replace "http://test.com" with the URL that is found.

$('div:contains(".com")').each(function () {
    $(this).addClass('contains-url');
    console.log('found div containing a URL');

    $(this).attr('href','http://test.com');

});


$('.contains-url').click(function () {
    console.log('clicked - this will open in a new tab');
    window.open($(this).attr("href"), '_blank');
});
div{
border: 1px solid #000; padding: 5px; width: 100%; margin-bottom: 10px;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div>
This div does not contain a URL
</div>

<div>
This div contains URL http://loremipsum.com
</div>

Any help is much appreciated.

1 Answer


You can use findUrls($(this).text()) from this link. The function is included below and returns an array of every URL it finds in the text.

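$('div:contains(".com")').each(function() {
  // findUrls returns an array of every match; use the first one as the link target.
  var urls = findUrls($(this).text());
  if (urls.length === 0) {
    return; // ".com" appeared, but no full URL was recognised
  }
  $(this).addClass('contains-url');
  console.log('found div containing a URL');
  $(this).attr('href', urls[0]);
  console.log($(this).attr('href'));
});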


$('.contains-url').click(function() {
  console.log('clicked - this will open in a new tab');
  window.open($(this).attr("href"), '_blank');
});

function findUrls(text) {
  var source = (text || '').toString();
  var urlArray = [];
  var url;
  var matchArray;

  // Regular expression to find FTP, HTTP(S) and email URLs.
  var regexToken = /(((ftp|https?):\/\/)[\-\w@:%_\+.~#?,&\/\/=]+)|((mailto:)?[_.\w-]+@([\w][\w\-]+\.)+[a-zA-Z]{2,3})/g;

  // Iterate through any URLs in the text.
  while ((matchArray = regexToken.exec(source)) !== null) {
    var token = matchArray[0];
    urlArray.push(token);
  }

  return urlArray;
}
div {
  border: 1px solid #000;
  padding: 5px;
  width: 100%;
  margin-bottom: 10px;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div>
  This div does not contain a URL
</div>

<div>
  This div contains URL http://loremipsum.com
</div>
Carsten Løvbo Andersen
  • Thank you for the quick reply. I'll try this out, but it looks like it will do what I need, so I'll mark your answer as correct :) – Michael Dolphin Apr 11 '17 at 06:42
  • Hey Carsten, it works, but it's currently only saving email addresses; it won't save normal URLs. See the screenshot of the console: http://pasteboard.co/2VhoXsTJZ.jpg (the red areas are the blank spaces generated by normal URLs) – Michael Dolphin Apr 11 '17 at 07:56
  • I assume it's because it's searching for http? Can it be made to just search for .com? Not all URLs contain http; most contain www, but not all – Michael Dolphin Apr 11 '17 at 07:57
  • Okay, I've adapted it to search for www as well as ftp and http. Thank you again :) Works great! – Michael Dolphin Apr 11 '17 at 08:19
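
The adapted expression from the last comment was never posted, so the following is only a rough sketch of one way to do it, not the commenter's actual code. The name findLooseUrls is hypothetical, and the list of bare-domain endings (.com, .org, .net) is an assumption you would extend for your own PDFs. It matches URLs with a scheme, URLs that start with www, and bare domains, skips email addresses, and prepends http:// where no scheme is present so the result can be passed to window.open.

```javascript
// Hypothetical variant of findUrls (not the commenter's actual code).
// Assumption: bare domains are recognised only for .com, .org and .net.
function findLooseUrls(text) {
  var source = (text || '').toString();
  var urls = [];
  var match;

  // Alternative 1: ftp://, http://, https:// or a leading "www.".
  // Alternative 2: a bare domain ending in one of the assumed TLDs,
  //                optionally followed by a path.
  var pattern = /(?:(?:ftp|https?):\/\/|www\.)[-\w@:%+.~#?,&\/=]+|[\w-]+(?:\.[\w-]+)*\.(?:com|org|net)\b(?:\/[-\w@:%+.~#?,&\/=]*)?/gi;

  while ((match = pattern.exec(source)) !== null) {
    // A match that directly follows "@" is the domain part of an email address.
    if (source.charAt(match.index - 1) === '@') {
      continue;
    }
    // Drop sentence punctuation the character class may have swallowed.
    var url = match[0].replace(/[.,?]+$/, '');
    // window.open treats scheme-less strings as relative paths, so add a scheme.
    if (!/^(?:ftp|https?):\/\//i.test(url)) {
      url = 'http://' + url;
    }
    urls.push(url);
  }
  return urls;
}

console.log(findLooseUrls('See http://loremipsum.com, www.example.com/page or example.org today.'));
// → ['http://loremipsum.com', 'http://www.example.com/page', 'http://example.org']
```

In the answer's snippet this could stand in for findUrls. Note that it deliberately ignores email addresses, which the original expression captured, so keep the original if you want mailto links as well.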