<textarea name="test">
  http://google.com/
  https://google.com/
  www.google.com/
  [url=http://google.com/]google.com[/url]
  text
</textarea>

My current attempt at checking if there is a URL in the textarea.

if ($('textarea[name="test"]').val().indexOf('[url') >= 0 ||
    $('textarea[name="test"]').val().match(/^http([s]?):\/\/.*/) ||
    $('textarea[name="test"]').val().match(/^www.[0-9a-zA-Z',-]./)) {

This doesn't reliably detect all of the URLs above, and I'm wondering how it can be improved. It feels sloppy and hacked together at the moment, and I'm hoping someone can offer some insight.

My current attempt at removing URLs from the textarea:

var value = $('textarea[name="test"]').val();
    value = value.replace(/\[\/?url([^\]]+)?\]/g, '');
$('textarea[name="test"]').val(value);

Right now, it will output:

<textarea>
  http://google.com/
  https://google.com/
  www.google.com/
  google.com
  text
</textarea>

What I'd like my output to be:

<textarea>
  text
</textarea>
O P
  • possible duplicate of [Regex to match URL](http://stackoverflow.com/questions/1141848/regex-to-match-url) – JJJ Feb 23 '13 at 19:34
  • Don't forget to do this check on the serverside too. Javascript can be disabled on the client. – ZippyV Feb 23 '13 at 19:37
  • @ZippyV I'm just trying to better my client side scripting; PHP checks are already in effect. – O P Feb 23 '13 at 19:38
  • Combined your regexp with another (from [here](http://stackoverflow.com/questions/1141848/regex-to-match-url)): http://jsfiddle.net/fw9MH/ – dfsq Feb 23 '13 at 19:42

4 Answers


Try (Corrected and improved after comments):

value = value.replace(/^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+$\s*/mg, '');

Peeling the expression from end to start:

  • An address might have two or three 'parts', besides the scheme
  • An address might start with www or not
  • It may be preceded by http:// or https://
  • It may be enclosed inside [url=...]...[/url]

This expression does not enforce fully correct URL syntax; that would be a much tougher regex to write.
A few improvements you might want:

1. Awareness of spaces

value = value.replace(/^\s*(\[\s*url\s*=\s*)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+\s*$\s*/mg, '');

2. Enforce no dots in the last part

value = value.replace(/^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?[^.\s]+$\s*/mg, '');
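To see what the first expression does in practice, here is a minimal sketch that runs it on plain strings rather than on a real textarea. It assumes the sample lines are not indented (the expression does not allow leading spaces); the names `sample`, `lineUrl`, `cleaned` and `falsePositive` are made up for illustration.

```javascript
// Sketch only: sample text mirrors the question's textarea, without indentation.
const sample = [
  'http://google.com/',
  'https://google.com/',
  'www.google.com/',
  '[url=http://google.com/]google.com[/url]',
  'text',
  ''
].join('\n');

// The first expression from this answer, unchanged.
const lineUrl = /^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+$\s*/mg;

// Every line that looks like a URL is removed along with its newline.
const cleaned = sample.replace(lineUrl, '');
console.log(JSON.stringify(cleaned));

// Caveat: any dotted token that sits alone on a line is treated as a URL too.
const falsePositive = 'o.o\ntext\n'.replace(lineUrl, '');
console.log(JSON.stringify(falsePositive));
```

Under these assumptions only the `text` line should survive in both cases, which also illustrates the false-positive risk raised in the comments.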
ilomambo
  • Unfortunately javascript doesn't support the inline `(?m)` multiline mode syntax. – MikeM Feb 23 '13 at 19:51
  • @MikeM the `(?m)` was for the sake of the newline char at the end. I guess that if you leave it out, then you should at least see the target lines replaced, but not the newline. Unless jquery is multiline by default, in which case it should work fine. – ilomambo Feb 23 '13 at 20:02
  • For multiline mode add the `m` at the end of the regex, as with `g`. At the moment your code throws an 'invalid quantifier' error because of the `(?m`. – MikeM Feb 23 '13 at 20:07
  • Your new regex will delete any line that contains two separated dots not at the start or end of it. And because your regex insists that `.` appears at least twice it won't match e.g. `http://google.com/`. If you add a `?` after the second `\.` it will then match such urls, although the first problem will remain. – MikeM Feb 24 '13 at 10:51
  • Okay, but it would give exactly the same result as using `/^\S+\.\S+$\s*/mg`, so it will delete any line of non-space characters that includes a dot not at the start or end. And it won't replace any url that has spaces after it on the same line. (I have only looked at your top regex.) – MikeM Feb 24 '13 at 12:31
  • @MikeM Actually I think you inadvertently gave the most practical solution for the OP. `/^\s*\S+\.\S+\s*$/mg` is short, concise, and probably matches most if not all of the possible targets. – ilomambo Feb 24 '13 at 14:21
  • This isn't that great. It will detect o.o as a url. It will be quite common for someone to forget a space after a period before starting a new sentence. – Paulie Jul 30 '14 at 03:09

Regarding your attempt at checking if there is a URL in the textarea.

if ($('textarea[name="test"]').val().indexOf('[url') >= 0 ||
    $('textarea[name="test"]').val().match(/^http([s]?):\/\/.*/) ||
    $('textarea[name="test"]').val().match(/^www.[0-9a-zA-Z',-]./)) {

Firstly, rather than getting the textarea value three times using multiple function calls, it would be better to store it in a variable before the checking, i.e.

var value = $('textarea[name="test"]').val();

The /^http([s]?):\/\/.*/, because of the ^ will only match if the "http://..." is found right at the beginning of the textarea value. The same applies to the ^www.. Adding the multiline flag m to the end of the regex would make ^ match the start of each line, rather than just the start of the string.

The .* in /^http([s]?):\/\/.*/ serves no purpose as it matches zero or more characters. The ([s]?) is better as s?.

In /^www.[0-9a-zA-Z',-]./, the . needs to be escaped to match a literal . if that is your intention, i.e. \., and I assume you mean to match more than one of the characters in the character class so you need to follow it with +.

It is more efficient to use the RegExp test method rather than match when the actual matches are not required, so, combining the above, you could have

if ( /^(\[url|https?:\/\/|www\.)/m.test( value ) ) {
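As a rough illustration of the points above, the following sketch compares the anchored pattern with and without the `m` flag on a plain string. The string `value` stands in for the textarea content, and the variable names are hypothetical.

```javascript
// Sketch only: `value` stands in for the textarea content.
const value = 'text\nhttp://google.com/\nwww.google.com/\n';

// Without the m flag, ^ matches only at the very start of the string.
const withoutM = /^https?:\/\//.test(value);

// With the m flag, ^ also matches at the start of each line.
const withM = /^https?:\/\//m.test(value);

// The combined check from this answer.
const combined = /^(\[url|https?:\/\/|www\.)/m;
const hasUrl = combined.test(value);
const plainText = combined.test('just text\nmore text');

console.log(withoutM, withM, hasUrl, plainText);
```

Because `test` returns a boolean and the pattern has no `g` flag, the same regex object can be reused safely for several checks.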

There is little point in the check anyway if you are only using it to decide whether you need to call replace, because the check is implicit in the replace call itself.

Using the simple criteria that strings of non-space characters at the start of a line and beginning with http[s]://, [url or www., should be removed, you could use

value = value.replace( /^(?:https?:\/\/|\[url|www\.)\S+\s*/gm, '' );

If the urls can appear anywhere you could use \b, meaning word boundary, instead of ^, and remove the m flag.

value = value.replace( /(?:\bhttps?:\/\/|\bwww\.|\[url)\S+\s*/g, '' );
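The difference between the two replacements can be sketched on plain strings as follows. The inputs are invented for illustration and assume no leading indentation; `lineStart` and `anywhere` are just labels for the two expressions above.

```javascript
// Sketch only: the two expressions from this answer, applied to made-up input.
const lineStart = /^(?:https?:\/\/|\[url|www\.)\S+\s*/gm;
const anywhere = /(?:\bhttps?:\/\/|\bwww\.|\[url)\S+\s*/g;

const block = 'http://google.com/\nwww.google.com/\n[url=http://google.com/]google.com[/url]\ntext\n';
const inline = 'see http://google.com/ and www.google.com/ now';

// URLs that start a line are removed together with the trailing newline.
const blockCleaned = block.replace(lineStart, '');

// The ^ version leaves URLs in the middle of a line alone...
const inlineUntouched = inline.replace(lineStart, '');
// ...while the \b version removes them wherever they appear.
const inlineCleaned = inline.replace(anywhere, '');

// Neither version recognises a bare host name without www.
const bareHost = 'mail.google.com'.replace(anywhere, '');

console.log(JSON.stringify([blockCleaned, inlineUntouched, inlineCleaned, bareHost]));
```

The last case shows the limitation of both patterns: a host name with no scheme and no `www.` prefix is left in place.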

It would be a waste of effort to try to offer a better regex solution without precise details of what forms of url may appear in the textarea, where they may appear and what characters may adjoin them.

If any valid url can appear anywhere in the textarea and be surrounded by any other characters, then there is no watertight solution.

MikeM
  • You don't deal, conveniently, with addresses of the form `mail.google.com`, so if those are possible addresses, your regex won't work. – ilomambo Feb 24 '13 at 14:22

The jQuery code below will do the job:

<script>
// Warn about links in any textarea that has the class "linkdisable".
jQuery('.linkdisable').focusout(function (e) {
  // Read the value of the element that lost focus, not of every match.
  var message = jQuery(this).val();
  // URLs with a scheme, e.g. http://example.com/path
  var schemeUrl = /(https?|ftp):\/\/[a-z0-9]+([\-.][a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/i;
  // Bare host names ending in a common TLD, e.g. example.com
  var bareHost = /^[a-z0-9\-.]+\.(com|org|net|mil|edu)$/i;
  if (schemeUrl.test(message) || bareHost.test(message)) {
    alert('Links Not Allowed');
    e.preventDefault();
  }
});
</script>
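One thing to be aware of with this approach: the first pattern ends in `$` and has no `m` flag, so it only detects a URL that runs to the very end of the textarea value. The sketch below checks this on plain strings; `schemeUrlPerLine` is a hypothetical variant with the `m` flag added, not part of the answer's code.

```javascript
// Sketch only: the scheme-based pattern from this answer, tested outside the DOM.
const schemeUrl = /(http|https|ftp):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/i;

// Hypothetical variant: same source, with the m flag so $ matches at each line end.
const schemeUrlPerLine = new RegExp(schemeUrl.source, 'im');

const urlOnLastLine = 'text\nhttp://google.com/';
const urlOnFirstLine = 'http://google.com/\ntext';

const detectedLast = schemeUrl.test(urlOnLastLine);
const detectedFirst = schemeUrl.test(urlOnFirstLine);
const detectedFirstPerLine = schemeUrlPerLine.test(urlOnFirstLine);

console.log(detectedLast, detectedFirst, detectedFirstPerLine);
```

If links on earlier lines should also trigger the alert, adding the `m` flag as in the variant is one possible adjustment.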
Vidhyut Pandya
mayurdarji

Try this as well: value = value.replace(/https?:\/\/[-A-Za-z0-9+&@#\/%?=~_|$!:,.;]*/g, '');

Sarah Thomas