I would like to detect URLs that are entered in a text input. I have the following code, which prepends http:// to whatever has been entered:

var input = $(this);
var val = input.val();
if (val && !val.match(/^http([s]?):\/\/.*/)) {
    input.val('http://' + val);
}

How would I go about adapting this to prepend http:// only if the input contains a string followed by a TLD? At the moment, if I enter a string such as:

Hello. This is a test

the http:// gets prepended to "Hello", even though the text is not a URL. Any help would be greatly appreciated.

Patrick Oscity
danyo

3 Answers

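This simple function works for me. To keep it fast, it does not check whether the TLD actually exists; it only checks that the text has the syntax of a domain such as example.com.

Sorry, I forgot that VBA's Trim() has no built-in counterpart in older JavaScript engines (String.prototype.trim() only arrived with ES5), so here are helper functions: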

// Removes leading whitespace
function LTrim(value)
{
    return value.replace(/^\s+/, "");
}

// Removes trailing whitespace
function RTrim(value)
{
    return value.replace(/\s+$/, "");
}

// Removes leading and trailing whitespace
function trim(value)
{
    return LTrim(RTrim(value));
}

function hasDomainTld(strAddress)
{
  var strUrlNow = trim(strAddress);
  // A comma or whitespace inside the text means it is not a single address
  if(strUrlNow.match(/[,\s]/))
  {
    return false;
  }
  // Syntax check only: the text must end with "label.label", e.g. example.com
  var regex = /[A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+$/;
  return regex.test(strUrlNow);
}

In your code, $(this) is the window object, so I pass the input element in as an argument (objInput) and use plain JavaScript instead of jQuery:

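function checkIt(objInput)
{
  // Trim first so that stray whitespace does not hide an existing scheme
  var val = trim(objInput.value);
  // Leave the value alone if it already starts with http:// or https://
  if(val.match(/^https?:\/\//i)) {
    return false;
  }
  else if (hasDomainTld(val)) {
    objInput.value = 'http://' + val;
  }
}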

You can test it yourself: http://jsfiddle.net/SDUkZ/8/
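For a quick check outside the browser, here is a compact sketch of the same idea. It is not the code from the fiddle: the names looksLikeDomain and prependScheme are made up for this example, it assumes an engine that has the built-in trim(), and it uses a plain object with a value property as a stand-in for the input element.

```javascript
// Hypothetical compact variant of the syntax check, using the built-in trim().
function looksLikeDomain(text) {
  const candidate = text.trim();
  // A comma or inner whitespace means this is prose, not a single address.
  if (/[,\s]/.test(candidate)) return false;
  // The text must end with "label.label", e.g. example.com.
  return /[A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+$/.test(candidate);
}

// `field` only needs a `value` property, so a plain object works here.
function prependScheme(field) {
  const value = field.value.trim();
  // Already has http:// or https://, nothing to do.
  if (/^https?:\/\//i.test(value)) return;
  if (looksLikeDomain(value)) field.value = 'http://' + value;
}

const prose = { value: 'Hello. This is a test' };
const domain = { value: '  example.com ' };
const secure = { value: 'https://example.com' };

[prose, domain, secure].forEach(prependScheme);

console.log(prose.value);  // unchanged: the text contains spaces
console.log(domain.value); // http://example.com
console.log(secure.value); // unchanged: it already has a scheme
```

The sentence from the question is left alone because it contains whitespace, which is the case the original code got wrong.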

jacouh

You need to narrow down your requirements first, because URL detection with regular expressions can be very tricky. These are just a few situations where your parser can fail:

  • IDNs (госуслуги.рф)
  • Punycode cases (xn--blah)
  • New TLDs being registered (.amazon)
  • SEO-friendly URLs (domain.com/Everything you need to know about RegEx.aspx)

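To make those failure cases concrete, here is a small sketch. The pattern naiveDomain is a hypothetical example of a "simple" check, not something from the question, and the Punycode-style name is made up by extending the xn--blah placeholder above.

```javascript
// Hypothetical naive check: an ASCII label, a dot, then a 2-3 letter TLD.
const naiveDomain = /^[A-Za-z0-9-]+\.[A-Za-z]{2,3}$/;

const samples = [
  'example.com',           // plain ASCII domain
  'госуслуги.рф',          // IDN
  'xn--blah.xn--p1ai',     // made-up Punycode-style name
  'shop.amazon',           // newer, longer TLD
  'domain.com/Everything you need to know about RegEx.aspx' // path with spaces
];

const results = samples.map(function (s) {
  return naiveDomain.test(s);
});

samples.forEach(function (s, idx) {
  console.log(results[idx] ? 'accepted' : 'rejected', s);
});
// Only 'example.com' is accepted; the other four are rejected.
```

Loosening the pattern to accept these cases tends to let ordinary prose through as well, which is the trade-off described above.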
We recently faced a similar problem. What we ended up doing was a simple check of whether the URL starts with http://, https://, or ftp://, prepending http:// if it starts with none of them. Here's the implementation in TypeScript:

public static EnsureAbsoluteUri(uri: string): string {
  var ret = uri || '', m = null, i = -1;
  var validSchemes = ko.utils.arrayMap(['http', 'https', 'ftp'], (i) => { return i + '://' });

  if (ret && ret.length) {
    m = ret.match(/[a-z]+:\/\//gi);

    /* Checking against a list of valid schemes and prepending with "http://" if check fails. */
    if (m == null || !m.length || (i = $.inArray(m[0].toLowerCase(), validSchemes)) < 0 ||
      (i >= 0 && ret.toLowerCase().indexOf(validSchemes[i]) != 0)) {

      ret = 'http://' + ret;
    }
  }

  return ret;
}

As you can see, we're not trying to be smart here, as we can't predict every possible URL form. Furthermore, this method is usually executed against field values we know are meant to be URLs, so the chance of misdetection is minimal.
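If you don't use Knockout or jQuery, the same check can be sketched in plain JavaScript. This is a simplified rewrite rather than the method above: the name ensureAbsoluteUri (lower camel case) and the constant VALID_SCHEMES are assumptions for this example, and it relies on String.prototype.startsWith being available.

```javascript
// Schemes accepted as-is, taken from the description above.
const VALID_SCHEMES = ['http://', 'https://', 'ftp://'];

// Returns the input unchanged if it starts with a valid scheme,
// otherwise returns it with "http://" prepended. Empty input stays empty.
function ensureAbsoluteUri(uri) {
  const value = uri || '';
  if (value === '') return value;

  const lower = value.toLowerCase();
  const hasValidScheme = VALID_SCHEMES.some(function (scheme) {
    return lower.startsWith(scheme);
  });

  return hasValidScheme ? value : 'http://' + value;
}

console.log(ensureAbsoluteUri('example.com'));         // http://example.com
console.log(ensureAbsoluteUri('HTTPS://example.com')); // HTTPS://example.com
console.log(ensureAbsoluteUri('ftp://files.example')); // ftp://files.example
```

Like the original, it makes no attempt to decide whether the text is a URL at all, so it belongs on fields that are already known to hold URLs.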

Hope this helps.

volpav

The best solution I have found is to use the following regex:

/\.[a-zA-Z]{2,3}/

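This matches a dot followed by two or three letters, which is how a short TLD such as .com or .uk appears at the end of a domain. The pattern is unanchored, so a longer extension such as .info also matches on its first three letters.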

Does this seem ok for basic validation? Please let me know if you see any problems that could arise.

I know that it will also match email addresses, but this won't matter in this instance.
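To see what this regex accepts, here is a quick sketch. The helper name hasDotTld and the sample inputs are made up for illustration.

```javascript
// The pattern from above: a dot followed by two or three letters.
const dotTld = /\.[a-zA-Z]{2,3}/;

// Hypothetical helper name, used only for this example.
function hasDotTld(text) {
  return dotTld.test(text);
}

const checks = {
  'Hello. This is a test': hasDotTld('Hello. This is a test'),
  'example.com': hasDotTld('example.com'),
  'user@example.com': hasDotTld('user@example.com'),
  'report.txt': hasDotTld('report.txt'),
  'site.info': hasDotTld('site.info')
};

Object.keys(checks).forEach(function (text) {
  console.log(checks[text], text);
});
// false for the sentence from the question (the dot is followed by a space),
// true for the other four, including the file name and the email address.
```

So it handles the original example, but anything with a dot directly followed by letters, such as a file name, is treated as a URL too.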

danyo