1

I've written two JavaScript functions that dynamically create RegExp groups matching numbers lower than or greater than the number passed as a parameter. The purpose of these functions is to do something like this or this, but dynamically. I need it for an app for building RegExps: this code produces a group that matches a particular range of numbers, and you can later use the returned group to complete your final RegExp.

Here is the function to create a RegExp to find patterns greater than a desired value:

//Find greater than numbers
function getGreaterUintRegEx(n) {
  var s = String(n);
  var t = s.length,
    a = [];
  for (var i = 1; i < t + 1; i++) {
    switch (s.charAt(t - i)) {
      case "9":
        a.push((Number(s.slice(0, t - i)) + 1) + "0" + (new Array(i)).join("\\d"));
        break;
      case "8":
        a.push(s.slice(0, t - i) + "9" + (new Array(i)).join("\\d"));
        break;
      default:
        a.push(s.slice(0, t - i) + "[" + (Number(s.charAt(t - i)) + 1) + "-9]" + (new Array(i)).join("\\d"));
    }
  }
  a.push("\\d{" + (t + 1) + ",}");
  a = a.filter(function(s, i) {
    return a.indexOf(s) == i;
  });
  return "(" + a.join("|") + ")";
}

Example of use:

var regstr = getGreaterUintRegEx(124);
// (12[5-9]|1[3-9]\d|[2-9]\d\d|\d{4,})

var regstr = getGreaterUintRegEx(500);
// (50[1-9]|5[1-9]\d|[6-9]\d\d|\d{4,})
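
A minimal sketch of how a returned group might be embedded in a larger expression. It assumes the group string from the 124 example above; the helper `wholeNumberRegExp` is hypothetical (not part of the original code) and simply adds word boundaries so that only whole numbers match:

```javascript
// Group string copied from the getGreaterUintRegEx(124) example above.
var group = "(12[5-9]|1[3-9]\\d|[2-9]\\d\\d|\\d{4,})";

// Hypothetical helper: wraps a group in word boundaries so that only
// whole runs of digits match, not digits embedded in a longer word.
function wholeNumberRegExp(groupStr) {
  return new RegExp("\\b" + groupStr + "\\b", "g");
}

var text = "7 124 125 99 1300 x500";
// With the "g" flag, match() returns every full match (or null if none).
var found = text.match(wholeNumberRegExp(group)) || [];
console.log(found); // whole numbers greater than 124 found in the text
```

Note that `x500` is skipped here because there is no word boundary between `x` and `5`; a different delimiter would need a different wrapper.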

And here is the function to create a RegExp to find patterns lower than a desired value:

//Find lower than numbers
function getLowerUintRegEx(n) {
  if (n == 0) return false;
  if (n == 1) return "(0)";
  if (n > 0 && n < 10) return "[0-" + (n - 1) + "]";
  var s = String(n);
  var t = s.length,
    a = [];
  for (var i = 1; i < t + 1; i++) {
    switch (s.charAt(t - i)) {
      case "0":
        a.push(((s.slice(0, t - i) == "1") ? "" : (Number(s.slice(0, t - i)) - 1)) + "9" + (new Array(i)).join("\\d"));
        break;
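      case "1":
        // A leading "1" allows any shorter number; elsewhere the digit can only drop to "0"
        // (without this, e.g. getLowerUintRegEx(11) would not match 10).
        a.push(i == t ? "[1-9]" + (new Array(i - 1)).join("\\d") : s.slice(0, t - i) + "0" + (new Array(i)).join("\\d"));
        break;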
      default:
        a.push(s.slice(0, t - i) + "[0-" + (Number(s.charAt(t - i)) - 1) + "]" + (new Array(i)).join("\\d"));
    }
  }
  if (t - 1 > 1) a.push("\\d{1," + (t - 1) + "}");
  a.push("0");
  a = a.filter(function(s, i) {
    return a.indexOf(s) == i;
  });
  return "(" + a.join("|") + ")";
}

Example of use:

var regstr = getLowerUintRegEx(498);
// (49[0-7]|4[0-8]\d|[0-3]\d\d|\d{1,2}|0)

var regstr = getLowerUintRegEx(125);
// (12[0-4]|1[0-1]\d|[1-9]\d|\d{1,2}|0)
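
One way to sanity-check a generated group is a brute-force comparison against the plain numeric test. This is only a sketch: `firstMismatch` is a hypothetical helper, and it assumes the group is meant to match the whole decimal string of a non-negative integer without leading zeros:

```javascript
// Hypothetical helper: returns the first integer in [0, limit] whose
// full-string match against the group disagrees with the predicate,
// or -1 if the two agree everywhere in that range.
function firstMismatch(groupStr, predicate, limit) {
  var re = new RegExp("^" + groupStr + "$");
  for (var n = 0; n <= limit; n++) {
    if (re.test(String(n)) !== predicate(n)) return n;
  }
  return -1;
}

// Group string copied from the getLowerUintRegEx(498) example above.
var lower498 = "(49[0-7]|4[0-8]\\d|[0-3]\\d\\d|\\d{1,2}|0)";
var mismatch = firstMismatch(lower498, function (n) { return n < 498; }, 5000);
console.log(mismatch); // -1 means no disagreement was found up to the limit
```

A check like this only covers the range up to `limit`, so it complements reading the group rather than proving it correct.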

I want to make these functions simpler and faster. They take more than a second with big numbers. Is there an easier method? Does anybody know a robust algorithm with fewer steps?

ElChiniNet
  • Any reason to use a RegExp rather than simply splitting the string and converting to numbers? – Xotic750 Jan 08 '16 at 00:35
  • The `String` could be that or any other text (that string is only an example), the `RegExp` finds patterns in any type of text. – ElChiniNet Jan 08 '16 at 00:39
  • So the string could be `d131dd02c5e6eec4 693d9a0698aff95c 2fcab58712467eab 4004583eb8fb7f89` and what would you expect as a result from say `getLowerUintRegEx(500)`? – Xotic750 Jan 08 '16 at 00:43
  • @Xotic750 The function always returns the same value when you pass 500 as the parameter; it only creates a string like this: (499|49\d|[0-4]\d\d|\d{1,2}|0). This `String` matches numbers from 0 to 499. The `String` that you posted could be an example; depending on the situation and the matches that I want to extract, the approach will be one or another. For example, if I want to extract all whole numbers (without digits at the sides) lower than 500 in that `String`, I could do this: https://jsfiddle.net/elchininet/dgsu9qyb/ – ElChiniNet Jan 08 '16 at 01:19
  • And what would you expect from number strings with a leading zero, `06` for example, or floating-point number strings like `54.1` or `0.21`? I am very uncertain as to your specification. – Xotic750 Jan 08 '16 at 01:37
  • Look at my post update. I only need to create a `RegExp` dynamically. Do not pay attention to the example strings. – ElChiniNet Jan 08 '16 at 01:40
  • @Timeout I need the function to return something like "(4[0-4]|[1-3]\d|\d)" when you pass it 45, and something like "(12[0-2]|1[0-1]\d|\d{1})" when you pass it 123. That's all; I only need to simplify my functions (my question). – ElChiniNet Jan 08 '16 at 01:54
  • Is there an actual problem with the code that you have now (is it not working?), or are you just looking for someone to review it? – Xotic750 Jan 08 '16 at 01:55
  • @Xotic750, the code works well, but my question is whether there is a better method to do this. Maybe some user knows a better method or can simplify my code. – ElChiniNet Jan 08 '16 at 02:05
  • It seems that your question is better suited to http://codereview.stackexchange.com/ rather than here. – Xotic750 Jan 08 '16 at 02:06
  • I'm voting to close this question as off-topic because it appears to be working fine and the OP is actually requesting a review, which seems more appropriate at http://codereview.stackexchange.com/ – Xotic750 Jan 08 '16 at 02:09
  • @Xotic750 I don't want a review, mate. I don't know why you are angry. I only need someone to tell me if there is another method to do this, because my method is long and slow. – ElChiniNet Jan 08 '16 at 02:13
  • I'm not angry, I have simply asked for an explanation of what you are trying to do, and what the problem is with your current code. From your replies it seems that you have working code and it appears that you want someone to review your code and suggest improvements/optimisations? For me that spells off-topic, but there is a sister site where these types of questions are asked. – Xotic750 Jan 08 '16 at 02:19
  • You shouldn't be using regular expressions here; your code will become much faster and saner if you write a proper parser. That said, you'll get a big speedup just by removing all unnecessary data structures (each call creates several Arrays and a Function, none are needed) and replacing `filter` with manual iteration. – twhb Jan 08 '16 at 08:58
  • Hi @twhb, I do not use regular expressions in the functions. How can I replace the `new Array`? Is there a faster method for doing an `str_pad`? I will take your advice and replace the filter to see the performance; maybe I'll gain some speed. Thanks a lot! – ElChiniNet Jan 08 '16 at 09:09
  • I mean, you should not use regular expressions to solve this problem, they're not powerful enough to make this reasonable. I've been playing around with this (longer than I should have), [here](https://gist.github.com/twhb/e6bb82c524d3cdd9d99a) is something that should be faster. Also maybe try replacing the template strings with the old `'' + ''`, they're a relatively new feature and browsers sometimes aren't so fast on their first go at implementing new features. – twhb Jan 08 '16 at 09:38
  • @twhb I need a `RegExp` because it is for a teaching/testing `RegExp` app ;). I'll follow your advice and update the post with the results. Thanks for everything. Regards. – ElChiniNet Jan 08 '16 at 09:46
  • Hi @twhb, I tested it and the execution time has improved. There is no great difference between the old concatenation method and the new ES6 template strings, but I prefer the old style until template strings become more standard. Please post an answer with your whole explanation and I will give you all the credit. Thanks for your effort. ;) [jsfiddle](https://jsfiddle.net/elchininet/10w2a0fa/) – ElChiniNet Jan 08 '16 at 11:51

3 Answers

1

This is not the way that I would solve the problem, but you are not looking for a better solution but actually want a review of your code, suggestions and optimisations that give the same functionality as your original (which is working code).

Anyway, below is a suggestion. The code is more readable, but I have no intention of testing its performance.

var reduceRight = Function.prototype.call.bind(Array.prototype.reduceRight);

//Find greater than numbers
function getGreaterUintRegEx(n) {
  var s = String(n);
  var t = s.length - 1;
  var a = reduceRight(s, function(acc, v, i) {
    var x = s.slice(0, i);
    if (v === '9') {
      x = Number(x) + 1 + '0';
    } else if (v === '8') {
      x += '9';
    } else {
      x += '[' + (Number(v) + 1) + '-9]';
    }
    acc.push(x + '\\d'.repeat(t - i));
    return acc;
  }, []);
  a.push('\\d{' + (t + 2) + ',}');
  return '(' + a.join('|') + ')';
}

//Find greater than numbers: original
function getGreaterUintRegEx1(n) {
  var s = String(n);
  var t = s.length,
    a = [];
  for (var i = 1; i < t + 1; i++) {
    switch (s.charAt(t - i)) {
      case "9":
        a.push((Number(s.slice(0, t - i)) + 1) + "0" + (new Array(i)).join("\\d"));
        break;
      case "8":
        a.push(s.slice(0, t - i) + "9" + (new Array(i)).join("\\d"));
        break;
      default:
        a.push(s.slice(0, t - i) + "[" + (Number(s.charAt(t - i)) + 1) + "-9]" + (new Array(i)).join("\\d"));
    }
  }
  a.push("\\d{" + (t + 1) + ",}");
  a = a.filter(function(s, i) {
    return a.indexOf(s) == i;
  });
  return "(" + a.join("|") + ")";
}

var out = document.getElementById('out');
for (var i = 0; i < 100000; i += 1) {
  var suggested = getGreaterUintRegEx(i);
  var original = getGreaterUintRegEx1(i);
  if (suggested !== original) {
    var txt = suggested + '!==' + original;
    out.textContent = txt;
    throw new Error(txt);
  }
}
out.textContent = suggested + '\n' + original + '\nSame results';
<script src="https://cdnjs.cloudflare.com/ajax/libs/es5-shim/4.4.1/es5-shim.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/json3/3.3.2/json3.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/es6-shim/0.34.1/es6-shim.js"></script>
<pre id="out"></pre>
Xotic750
  • `This is not the way that I would solve the problem`. If you have a better solution, please give it to me (I asked for it in my question). I never said that I don't want another solution; I only said that I need a `RegExp` group, not a replace function, because my app needs a `RegExp`. If the result is a text representation of a `RegExp` group that matches a number lower or greater than another one, any solution will be well received. Thanks for your effort, and forgive me if I have been a little annoying; it was not my intention. – ElChiniNet Jan 08 '16 at 13:27
  • As I have already said, and others have said and demonstrated, replacing, splitting etc are much better. But these are not suggestions that you are looking for. – Xotic750 Jan 08 '16 at 13:32
  • I think that you do not understand the functionality of my app yet. I don't want to replace anything; that is not my objective. I'll try to describe what my app does: the user picks options, and the app translates those options and builds a `RegExp` `String` for the user. These functions are a small piece of my app; they translate the part about matching a number lower or greater than a number picked by the user. – ElChiniNet Jan 08 '16 at 13:42
  • I have reviewed `getGreaterUintRegEx` for you, and have given suggestions of how to improve the readability. It produces the same output as your original working function. You can apply these ideas in a similar fashion to `getLowerUintRegEx`. – Xotic750 Jan 08 '16 at 13:44
  • Thanks @Xotic750. The performance is a little lower than with the twhb method, but the code is well structured and compact. I'll keep it in mind when building my final function. Thanks to you too. +1 [jsfiddle](https://jsfiddle.net/elchininet/jw5w5nax/) – ElChiniNet Jan 08 '16 at 14:11
  • You could retain readability and probably gain in performance by using pure ES6, but you mention that you'd prefer to stay ES5 or earlier. You can also gain in performance by using just ES3, but you will lose some of the readability gained by the ES5 methods. (`reduceRight` introduces a number of checks that are not needed in this situation, plus a couple of function invocations). Personally I'd take readability (and maintainability) over ES3 performance gains, unless the performance gains were that crucial and noticeable. – Xotic750 Jan 08 '16 at 14:50
  • Like you were running them x 100 times a second or something, but it sounds like they are run upon user selection. – Xotic750 Jan 08 '16 at 14:55
  • Thank you very much. I hope that ES6 is fully implemented soon, because it is more structured and legible. Thanks for your comments and suggestions; I'll keep in mind the point about legibility and maintainability over performance. ;) Thanks – ElChiniNet Jan 08 '16 at 14:56
  • A gain that you could make is changing the `if` layout, check the most common first and then least common. Instead of the first check being `if (v === '9') {` and so on, make it `if (v < '8') {` and then perform the other 2 checks. Readability will remain the same. – Xotic750 Jan 08 '16 at 15:02
  • The ordering of your regex will likely have a greater impact, perhaps checking `\d{4,}` first. – Xotic750 Jan 08 '16 at 15:16
  • Thanks, I'll try to invert the checking of `\d{X,}`, although I'll not use the generated `RegExps` they must be clear and compact too. With the conditions I prefer the [while over the if](http://jsperf.com/performance-of-assigning-variables-in-javascript). Thanks for all. – ElChiniNet Jan 08 '16 at 15:23
  • I think you mean `switch` over `if`. While this can be (often is) more performant, it can make debugging more difficult and the code less readable and less maintainable. An alternative is to use object literals. But again, all this is peanuts compared to your regex ordering. One person's view, you can search for more. https://toddmotto.com/deprecating-the-switch-statement-for-object-literals/ – Xotic750 Jan 08 '16 at 15:38
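
The two suggestions from the comments above (test the most common digit case first, and put the `\d{N,}` alternative first in the group) could be combined roughly as follows. This is a sketch only: the name `getGreaterUintRegExReordered` is hypothetical, a plain loop replaces `reduceRight`, and the output order deliberately differs from the original function, so the two are not string-identical:

```javascript
// Hypothetical variant: same alternatives as getGreaterUintRegEx,
// but with the "more digits" alternative emitted first.
function getGreaterUintRegExReordered(n) {
  var s = String(n);
  var t = s.length - 1;
  var a = ['\\d{' + (t + 2) + ',}']; // longest alternative first
  for (var i = t; i >= 0; i--) {
    var v = s.charAt(i);
    var x = s.slice(0, i);
    if (v < '8') {            // most common case: digits 0-7
      x += '[' + (Number(v) + 1) + '-9]';
    } else if (v === '8') {
      x += '9';
    } else {                  // '9': carry into the prefix
      x = Number(x) + 1 + '0';
    }
    // String.prototype.repeat is ES6, as in the answer above.
    a.push(x + '\\d'.repeat(t - i));
  }
  return '(' + a.join('|') + ')';
}

console.log(getGreaterUintRegExReordered(124));
```

Whether the reordering is measurably faster would depend on the input text and the engine, so it is worth benchmarking before adopting it.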
1

You'll get a big speedup just by removing unnecessary data structures (each call creates several Arrays and a Function, none are needed).

Here's a rewrite of just getGreaterUintRegEx:

function getGreaterUintRegEx(n) {
  var nStr = String(n);
  var len = nStr.length;
  var result = '(';
  var ds = '';
  var i;

  for (i = len - 1; i >= 0; i--) {
    switch (nStr.charAt(i)) {
      case '9': result += `${+nStr.slice(0, i) + 1}0${ds}|`; break;
      case '8': result += `${nStr.slice(0, i)}9${ds}|`; break;
      default:  result += `${nStr.slice(0, i)}[${+nStr.charAt(i) + 1}-9]${ds}|`;
    }
    ds += '\\d';
  }
  return `${result}\\d{${len + 1},})`;
}

I've used ES6 template strings just for readability. They're currently supported across evergreen browsers, but you'll want to swap them for the old '' + '' if you want to support IE11.
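
For reference, a sketch of the same function with the template strings swapped for plain `'' + ''` concatenation. The logic is intended to be identical to the version above; only the string building changes:

```javascript
// Same logic as the template-string version, using '' + '' concatenation.
function getGreaterUintRegEx(n) {
  var nStr = String(n);
  var len = nStr.length;
  var result = '(';
  var ds = ''; // grows by one "\d" per processed digit
  var i;

  for (i = len - 1; i >= 0; i--) {
    switch (nStr.charAt(i)) {
      case '9': result += (+nStr.slice(0, i) + 1) + '0' + ds + '|'; break;
      case '8': result += nStr.slice(0, i) + '9' + ds + '|'; break;
      default:  result += nStr.slice(0, i) + '[' + (+nStr.charAt(i) + 1) + '-9]' + ds + '|';
    }
    ds += '\\d';
  }
  return result + '\\d{' + (len + 1) + ',})';
}

console.log(getGreaterUintRegEx(124));
```

The parentheses around `+nStr.slice(0, i) + 1` matter: they keep the addition numeric before the result is concatenated with `'0'`.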

twhb
  • Thanks @twhb, with your advice and Xotic750's, I'll improve my code. Thanks to both of you for everything. ;) – ElChiniNet Jan 08 '16 at 15:53
0

Your approach is way too complicated, and way too limited.

function getInts(str){
    // match() returns null when nothing matches, so fall back to an empty array
    return (String(str).match(/[+-]?\d+/g) || []).map(Number);
}
function greaterThan(w){
    return function(v){ return v > w } 
}
function lowerThan(w){ //
    return function(v){ return v < w } 
}

getInts(someString).filter( greaterThan(473) )

or the more generic approach:

var is = (function(is){
    for(var key in is) is[key] = Function("w", "return function(v){return v " + is[key] + " w}");
    return is;
})({
    eq: "===",
    neq: "!==",
    gt: ">",
    lt: "<",
    gteq: ">=",
    lteq: "<="
});
is.typeof = function(w){ return function(v){ return typeof v === w }};

getInts(someString).filter( is.lt(92) );
Thomas
  • Read my post. I don't want to search for a single number pattern (that is not my intention); I want to get a `RegExp` to show to the user, and this `RegExp` group could then be used to form a more complex expression. I know that it is complicated; I'm looking for a less complicated way to get a `RegExp` group from a desired number. – ElChiniNet Jan 08 '16 at 08:41