Okay gang, here's my conundrum:

I'm looking to match a string using vanilla JavaScript's test(), a function of the RegExp prototype, to test an input variable inp:

/{CONDITION}/.test(inp)

The string must meet the following conditions:

  • It can be one or two characters long. Easy enough.

    /^.{1,2}$/.test(inp)
    
  • It's gotta be case-insensitive. No problem.

    /^.{1,2}$/i.test(inp)
    
  • If it's only a single character, it must be one of the characters [tmbrcl]

    /^[tmblcr]{1}$/i.test(inp)
    
  • If it's two characters long, the first character must be from [tmb] OR [lcr], and the second must be from whichever set the first isn't. Okay:

    /^([tmblcr]{1})$|^([tmb]{1}[lcr]{1})$|^([lcr]{1}[tmb]{1})$/i.test(inp)
    

Examples:

't'   // Good
'B'   // Good
'Rc'  // Bad (both characters are from [lcr])
'bl'  // Good
'tb'  // Bad
'mm'  // Bad
'cC'  // Bad
'BB'  // Bad
'Bob' // Bad
'5'   // Bad
'Ċ'   // Still Bad
'ß'   // Suspiciously Bad
''  // Now you're just screwing with me
'上'  // You know what? I don't care if this fails gracefully or not. ^&%* you.

My objective here is to parse out user input that will indicate the vertical and horizontal position ('T'/'M'/'B' representing 'Top'/'Middle'/'Bottom' and 'L'/'C'/'R' representing 'Left'/'Center'/'Right', respectively). The user should be allowed to pass in any permutation of the two groupings, in any case, in any order (or just a single one, in which case the other is inferred as the default).

I'm not fixated on using regex, but it seemed way clumsy to do something like (or equally obtuse):

  let errd  = false,
      set1  = 'TMB',
      set2  = 'LCR',
      sets  = set1 + set2,
      up    = inp.toUpperCase();                // sets are upper-case, so normalize the input
  if(inp.length === 1) errd = sets.indexOf(up) === -1;
  else if(inp.length === 2){
      let inV = c => set1.indexOf(c) !== -1,    // is c one of T/M/B?
          inH = c => set2.indexOf(c) !== -1;    // is c one of L/C/R?
      // valid only if one character is vertical and the other horizontal
      errd = !((inV(up[0]) && inH(up[1])) || (inH(up[0]) && inV(up[1])));
  }else errd = true;

So my question is: Is there really no more graceful way to handle this than simply spitting out every permutation of the desired outcome?

/^[SINGLE (S)]$|^[CASE A/B]$|^[CASE B/A]$/i

I mean, what if there were THREE,

/^[S]$|^[AB]$|^[AC]$|^[BC]$|^[BA]$|^[CA]$|^[CB]$|^[ABC]$|^[ACB]$|^[BAC]$|^[BCA]$|^[CAB]$|^[CBA]$/i

or (gods help me), FOUR characters with a similar set of restrictions? I'm relatively new at RegEx, and I'm wondering if I'm missing a core principle here. I HAVE a working solution (the "/^[S]|[AB]|[BA]$/" version), but is this actually the RIGHT one?

EDIT

Thanks for the stellar, comprehensive answer, Sweeper!

(Here's the working code in context, in case it'll help someone else later):

orient: function(objQS, rPos='TL', offsetWidth=0, offsetHeight=0)  {
    try{

        // objQS accepts a QuerySelector string or an HTMLElement
        let obj = (typeof(objQS) === 'string') ? document.querySelector(objQS) : objQS;
        // Validate the target before dereferencing it, so a bad selector reports 'Invalid Target!'
        if(null == obj || typeof(obj) !== 'object'){ throw('Invalid Target!'); }
        console.log('obj', obj, obj.getBoundingClientRect())
        let objBRC = obj.getBoundingClientRect();

        // rPos accepts TL, T/TC, TR, ML, M/C/MC, MR, BL, B/BC, BR (case- and order-insensitive)
        if(!/^(?:[tmbrcl]|[tmb][rcl]|[rcl][tmb])$/i.test(rPos)){ throw('Invalid orientation specified!'); }

        // Accommodate single-character entry of 'm' or 'c', both taken to mean 'mc' ('m'iddle-'c'enter)
        if(/^[mc]$/i.test(rPos)) { rPos = 'mc'; } 

        // Set default orientation to top-left (tl/lt), meaning we have nothing to do for 't'op or 'l'eft
        let osT = objBRC.y + offsetHeight,                       // Note we add the user-set offsets to our bases
            osL = objBRC.x + offsetWidth;                        // so they carry through to the other options.
        if(/m/i.test(rPos))      { osT += (objBRC.height / 2); } // Adjust vertically for 'm'iddle (top + height/2)
        if(/b/i.test(rPos))      { osT += objBRC.height; }       // Adjust vertically for 'b'ottom (top + height)
        if(/c/i.test(rPos))      { osL += (objBRC.width / 2); }  // Adjust horizontally for 'c'enter (left + width/2)
        if(/r/i.test(rPos))      { osL += objBRC.width; }        // Adjust horizontally for 'r'ight (left + width)

        objBRC.offsetTop  = osT;
        objBRC.offsetLeft = osL;
        this.place(osL, osT);
        console.log('return', 'objBRC:', objBRC)
        return objBRC;
    }catch(e){
        console.group('ERROR DETAILS (Error in callout.orient)');
        console.error('Error details:\n  - ', e);
        console.groupEnd();
        return false;
    }
}
ZenAtWork

1 Answer

Your regex can be greatly shortened to this:

/^(?:[tmbrcl]|[tmb][rcl]|[rcl][tmb])$/i

which I think is a good enough solution. It reads quite clearly:

Between the start and end of the string, there are three options:

  • one of [tmbrcl]
  • one of [tmb] then one of [rcl]
  • one of [rcl] then one of [tmb]

You don't actually need all those {1}s.
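
As a quick sanity check, here is a small sketch that runs the sample inputs from the question against the shortened pattern. The `mismatches` helper and the `good`/`bad` arrays are names made up for this illustration. Note that 'Rc' pairs two letters from the horizontal set, so it is listed with the inputs the pattern rejects.

```javascript
// The shortened pattern, copied verbatim from above.
const position = /^(?:[tmbrcl]|[tmb][rcl]|[rcl][tmb])$/i;

// Sample inputs taken from the question, grouped by expected outcome.
const good = ['t', 'B', 'bl'];
const bad  = ['Rc', 'tb', 'mm', 'cC', 'BB', 'Bob', '5', 'Ċ', 'ß', '', '上'];

// Hypothetical helper: returns the inputs whose test() result differs from `expected`.
function mismatches(inputs, expected) {
    return inputs.filter(inp => position.test(inp) !== expected);
}

console.log(mismatches(good, true));   // inputs that should match but do not
console.log(mismatches(bad, false));   // inputs that should fail but match
```

If the pattern behaves as described, both logged arrays should come back empty.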

EDIT:

I didn't realise you were asking about cases with more sets. In that case, I think you should employ a different approach.

One way is this:

  1. Have one regex for each of the sets:

    var r1 = /[abc]/i // notice the missing ^ and $ anchors
    var r2 = /[def]/i
    var r3 = /[ghi]/i
    
  2. Put them all in an array

    var regexes = [r1, r2, r3]
    
  3. Loop through the array and count how many regexes match the string

  4. The number of regexes that match the string should be equal to the length of the string.

Note that this assumes that your sets do not intersect.
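
Steps 3 and 4 could be sketched roughly like this. The function name `isValidCombo` is made up for the example, and the empty-string guard is an extra assumption that isn't in the steps above: without it, an empty string would pass, since zero matches equals a length of zero.

```javascript
// Placeholder sets, same as the regexes in step 1 (no ^ and $ anchors).
const regexes = [/[abc]/i, /[def]/i, /[ghi]/i];

// Hypothetical helper name; not part of any library.
function isValidCombo(inp) {
    // Assumed guard: reject the empty string, which would otherwise give 0 === 0.
    if (inp.length === 0) return false;
    // Step 3: count how many of the set regexes find a match somewhere in inp.
    const matched = regexes.filter(re => re.test(inp)).length;
    // Step 4: with non-intersecting sets, equal counts mean one character per set.
    return matched === inp.length;
}

console.log(isValidCombo('a'));    // true
console.log(isValidCombo('Ge'));   // true
console.log(isValidCombo('dah'));  // true
console.log(isValidCombo('ab'));   // false (both from the same set)
console.log(isValidCombo('a5'));   // false ('5' is in no set)
console.log(isValidCombo(''));     // false (guard)
```

Adding a fourth set then only means pushing one more regex onto the array, rather than writing out every permutation.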

Sweeper
  • I see. Use the non-capturing group as a sort of order of operations, evaluating the internal group first. Right on. But again, though, does that imply that, given the same scenario, but with 3 characters, across three potential groups (call them `abc`, `def`, `ghi`), and allowing for 1, 2, OR 3 characters (no repeats, one per group), I'm looking at /^(?:[abcdefghi]|[abc][def]|[abc][ghi]|[def][ghi]|[def][abc]|[ghi][def]|[ghi][abc]|[abc][def][ghi]|[abc][ghi][def]|[def][abc][ghi]|[def][ghi][abc]|[ghi][abc][def]|[ghi][def][abc])$/i – ZenAtWork Jan 01 '19 at 01:32
  • @ZenAtWork I have edited in another solution. Didn't realise you were asking about more sets. – Sweeper Jan 01 '19 at 01:33
  • I'm savvy to it. Okay, so that answers my question, optimizes my initial clumsy attempt, and clearly informs me that RegEx alone is not necessarily the correct tool when dealing with a larger number of sets (see: inflating bike tire with a banana). Thanks for the prompt response; answer selected. – ZenAtWork Jan 01 '19 at 01:36