3

I have a form that contains a "URL" field. The user fills in the first portion in a text box. The second portion is predefined and is shown to the right of the text box.

For example, the user enters "test" in the text box. The second portion is predefined as ".example.com". So the total URL becomes "test.example.com".

I need a regular expression to validate the first portion. The following conditions are to be satisfied:

  1. Should not start or end with a hyphen

  2. Should contain at least one letter

  3. Length should be between 4 and 21

I have used the regex /^(?!:\/\/)([a-zA-Z0-9]+\.)?[a-zA-Z0-9][a-zA-Z0-9-]+\.[a-zA-Z]{2,6}?$/i, which is mentioned in this thread:

Javascript regex to match fully qualified domain name, without protocol, optional subdomain

But this regex validates the whole URL (including the second portion). I need the validation for first portion only.

How do I modify the current regex to match the requirements?

prajeesh
  • 1
    Rather than reinvent the wheel with your regex, why not just store the first portion as a separate variable, validate that part, and *then* join it to the second portion? – freginold Nov 01 '17 at 12:43
  • I am also thinking the same way. Is there any regex that validates the above 3 conditions? – prajeesh Nov 01 '17 at 12:45

3 Answers

1

Break it down. Use the regexes:

  • /^-|-$/ (must not match)
  • /[a-z]/i (must match)
  • /^[a-z0-9-]{4,21}$/i (must match)

The advantage to doing this is that you can provide meaningful error messages to the user.

document.getElementById('subdomain').addEventListener("input",function() {
  var input = this.value;
  if( input.match(/^-|-$/)) this.setCustomValidity("Cannot start or end with a hyphen");
  else if( !input.match(/[a-z]/i)) this.setCustomValidity("Must contain at least one letter");
  else if( !input.match(/^[a-z0-9-]{4,21}$/i)) this.setCustomValidity("Must be between 4 and 21 characters long");
  // add additional checks here, eg. /^[0-9]/ => Cannot start with a number
  else this.setCustomValidity("");
},true);
<form>
  <input type="text" id="subdomain" />.example.com
</form>
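The same three checks can also be sketched as a plain function that runs outside the browser, which makes them easier to test. The name `validateSubdomain` and the wording of the third message are assumptions for illustration, not part of the snippet above.

```javascript
// Hypothetical helper: returns an error message, or "" when the value is valid.
function validateSubdomain(input) {
  // Rule 1: no leading or trailing hyphen.
  if (/^-|-$/.test(input)) return "Cannot start or end with a hyphen";
  // Rule 2: at least one letter somewhere in the value.
  if (!/[a-z]/i.test(input)) return "Must contain at least one letter";
  // Rule 3: only letters, digits and hyphens, 4 to 21 characters in total.
  if (!/^[a-z0-9-]{4,21}$/i.test(input)) return "Must be 4 to 21 letters, digits or hyphens";
  return "";
}

console.log(validateSubdomain("test"));  // ""
console.log(validateSubdomain("-test")); // "Cannot start or end with a hyphen"
console.log(validateSubdomain("1234"));  // "Must contain at least one letter"
console.log(validateSubdomain("abc"));   // "Must be 4 to 21 letters, digits or hyphens"
```

The returned string could be passed straight to setCustomValidity in the input handler.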
Niet the Dark Absol
  • 1
    I upvoted because breaking down regex in single clauses is more constructive and organized. – Thielicious Nov 01 '17 at 14:20
  • I downvoted because of lots of useless garbage in the answer like form and browser elements which was not requested; also the OP's statement is that he / she wants to receive one regexp only and this answer clearly conflicts with author's intent. Actually weird to see answer like that from a guy with 234k – smnbbrv Nov 01 '17 at 16:39
  • @smnbbrv The form is required for the MCVE, since without it there is no validation. And secondly, have you heard of the X/Y problem? Sometimes, when a user asks for Y, they would be better served with a solution for X. That's why I, "a guy with 234k", have 234k. Just saying... – Niet the Dark Absol Nov 01 '17 at 19:23
0

This should do the trick:

/^(?=.*[A-Za-z])([0-9A-Za-z][0-9A-Za-z-]{2,19}[0-9A-Za-z])$/

Explanation:

  • (?=.*[A-Za-z]) is a positive lookahead (at least one character from the set [A-Za-z])
  • ([0-9A-Za-z] opens the pattern with a character that is not a hyphen
  • [0-9A-Za-z]) closes the pattern with a character that is not a hyphen
  • [0-9A-Za-z-]{2,19} is the rest, which sits between the first and last character (hyphen allowed, length from 4 - 2 to 21 - 2)

Check:

var RE = /^(?=.*[A-Za-z])([0-9A-Za-z][0-9A-Za-z-]{2,19}[0-9A-Za-z])$/;

console.log(RE.test('-hyphen'), false);
console.log(RE.test('hyphen-'), false);
console.log(RE.test('lt4'), false);
console.log(RE.test('morethan21-morethan21-morethan21-morethan21-morethan21'), false);
console.log(RE.test('23123'), false);
console.log(RE.test('231-23'), false);
console.log(RE.test('[\]^_`'), false);
console.log(RE.test('H231-23'), true);
console.log(RE.test('2s31-23'), true);
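
The length arithmetic can be made explicit with a small sketch that builds the pattern from the minimum and maximum lengths. The helper name `buildSubdomainRegex` is hypothetical, and the sketch assumes a minimum length of at least 2, because the first and last characters are matched separately.

```javascript
// Hypothetical helper: builds the single-regex pattern for a given length range.
function buildSubdomainRegex(minLength, maxLength) {
  // The first and last characters are matched on their own,
  // so the middle run is two characters shorter than the total.
  const middle = "{" + (minLength - 2) + "," + (maxLength - 2) + "}";
  return new RegExp(
    "^(?=.*[A-Za-z])[0-9A-Za-z][0-9A-Za-z-]" + middle + "[0-9A-Za-z]$"
  );
}

const re = buildSubdomainRegex(4, 21);
console.log(re.test("a".repeat(3)));  // false, one character too short
console.log(re.test("a".repeat(4)));  // true, lower bound
console.log(re.test("a".repeat(21))); // true, upper bound
console.log(re.test("a".repeat(22))); // false, one character too long
```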

Credit to this answer for the positive lookahead (?=.*[A-Za-z])

smnbbrv
0

I looked through many questions and answers and could not find a good one. Some of them do not follow the rules of domain registration: you cannot use '-' more than once in a row in a domain name, although you can use it many times. For instance, www.my--domain-name.com is invalid and www.my-domain-name.com is valid. For example, I saw the question "What is a good regular expression to match a URL?", but it has some problems. I used this code, but it did not solve my problem: (https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})

I used this regular expression in ASP.NET and it works well for me:

^[(((ftp|http|https):\/\/)?(?:www\.|(?!www))[\u0061-\u007a\u0041-\u005a\u0030-\u0039 \u2000-\u200f\u2028-\u202f\u0621-\u0628\u062a-\u063a\u0641-\u0642\u0644-\u0648\u064e-\u0651\u0655\u067e\u0686\u0698\u06a9\u06af\u06be\u06cc\u06f0-\u06f9\u0629\u0643\u0649-\u064b\u064d\u06d5\u0660-\u0669\u005c]+(-[\u0061-\u007a\u0041-\u005a\u0030-\u0039 \u2000-\u200f\u2028-\u202f\u0621-\u0628\u062a-\u063a\u0641-\u0642\u0644-\u0648\u064e-\u0651\u0655\u067e\u0686\u0698\u06a9\u06af\u06be\u06cc\u06f0-\u06f9\u0629\u0643\u0649-\u064b\u064d\u06d5\u0660-\u0669\u005c]+)*\.[^\s^_]{2,}|www\.[\u0061-\u007a\u0041-\u005a\u0030-\u0039 \u2000-\u200f\u2028-\u202f\u0621-\u0628\u062a-\u063a\u0641-\u0642\u0644-\u0648\u064e-\u0651\u0655\u067e\u0686\u0698\u06a9\u06af\u06be\u06cc\u06f0-\u06f9\u0629\u0643\u0649-\u064b\u064d\u06d5\u0660-\u0669\u005c]+(-[\u0061-\u007a\u0041-\u005a\u0030-\u0039 \u2000-\u200f\u2028-\u202f\u0621-\u0628\u062a-\u063a\u0641-\u0642\u0644-\u0648\u064e-\u0651\u0655\u067e\u0686\u0698\u06a9\u06af\u06be\u06cc\u06f0-\u06f9\u0629\u0643\u0649-\u064b\u064d\u06d5\u0660-\u0669\u005c]+)*\.[^\s^_]{2,}|((ftp|http|https):\/\/)?(?:www\.|(?!www))[\u0061-\u007a\u0041-\u005a\u0030-\u0039 \u2000-\u200f\u2028-\u202f\u0621-\u0628\u062a-\u063a\u0641-\u0642\u0644-\u0648\u064e-\u0651\u0655\u067e\u0686\u0698\u06a9\u06af\u06be\u06cc\u06f0-\u06f9\u0629\u0643\u0649-\u064b\u064d\u06d5\u0660-\u0669\u005c]+(-[\u0061-\u007a\u0041-\u005a\u0030-\u0039 \u2000-\u200f\u2028-\u202f\u0621-\u0628\u062a-\u063a\u0641-\u0642\u0644-\u0648\u064e-\u0651\u0655\u067e\u0686\u0698\u06a9\u06af\u06be\u06cc\u06f0-\u06f9\u0629\u0643\u0649-\u064b\u064d\u06d5\u0660-\u0669\u005c]+)*\.[^\s^_]{2,}|www\.[\u0061-\u007a\u0041-\u005a\u0030-\u0039 \u2000-\u200f\u2028-\u202f\u0621-\u0628\u062a-\u063a\u0641-\u0642\u0644-\u0648\u064e-\u0651\u0655\u067e\u0686\u0698\u06a9\u06af\u06be\u06cc\u06f0-\u06f9\u0629\u0643\u0649-\u064b\u064d\u06d5\u0660-\u0669\u005c]+(-[\u0061-\u007a\u0041-\u005a\u0030-\u0039 
\u2000-\u200f\u2028-\u202f\u0621-\u0628\u062a-\u063a\u0641-\u0642\u0644-\u0648\u064e-\u0651\u0655\u067e\u0686\u0698\u06a9\u06af\u06be\u06cc\u06f0-\u06f9\u0629\u0643\u0649-\u064b\u064d\u06d5\u0660-\u0669\u005c]+)*\.[^\s^_]{2,}]{1,2083}$

This regular expression is specialized for the Persian language. If you want to use it for English, use this one instead:

^[(((ftp|http|https):\/\/)?(?:www\.|(?!www))[\u0061\u002d\u007a\u0041\u002d\u005a\u0030\u002d\u0039]+(-[\u0061\u002d\u007a\u0041\u002d\u005a\u0030\u002d\u0039]+)*\.[^\s^_]{2,}|www\.[\u0061\u002d\u007a\u0041\u002d\u005a\u0030\u002d\u0039]+(-[\u0061\u002d\u007a\u0041\u002d\u005a\u0030\u002d\u0039]+)*\.[^\s^_]{2,}|((ftp|http|https):\/\/)?(?:www\.|(?!www))[\u0061\u002d\u007a\u0041\u002d\u005a\u0030\u002d\u0039]+(-[\u0061\u002d\u007a\u0041\u002d\u005a\u0030\u002d\u0039]+)*\.[^\s^_]{2,}|www\.[\u0061\u002d\u007a\u0041\u002d\u005a\u0030\u002d\u0039]+(-[\u0061\u002d\u007a\u0041\u002d\u005a\u0030\u002d\u0039]+)*\.[^\s^_]{2,}]{1,2083}$
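
For the subdomain field in the question, the "no two hyphens in a row" rule as stated in this answer could be added with a negative lookahead. This is a minimal JavaScript sketch rather than the ASP.NET pattern above. The constant name is hypothetical, and the other rules (4 to 21 characters, at least one letter, no leading or trailing hyphen) are assumed from the question.

```javascript
// Assumption: this checks only the user-entered label, not a whole URL.
// (?!.*--) rejects any value that contains two consecutive hyphens.
const NO_DOUBLE_HYPHEN = /^(?!.*--)(?=.*[a-z])[a-z0-9][a-z0-9-]{2,19}[a-z0-9]$/i;

console.log(NO_DOUBLE_HYPHEN.test("my-domain-name"));  // true
console.log(NO_DOUBLE_HYPHEN.test("my--domain-name")); // false
```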
Arash Yazdani