The idea of the code below is to consider substrings of every length that divides the length of the original string evenly, and to check whether each one repeats across the whole string. A simple way to find those lengths is to try every candidate divisor n from 1 up to the square root of the length: n is a divisor if length / n is an integer, and that quotient is then the complementary divisor. E.g., for a string of length 100 the divisors up to the square root are 1, 2, 4, 5, 10, and the complementary divisors are 100 (not useful as a substring length, because the substring would appear only once), 50, 25, 20 (and 10, which we already found).
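// Returns true if str is its first sublen characters repeated subcount times
// (the caller guarantees sublen * subcount == str.length).
function substr_repeats(str, sublen, subcount)
{
    for (var c = 0; c < sublen; c++) {
        var chr = str.charAt(c);
        // Compare position c of the first block with position c of every later block.
        for (var s = 1; s < subcount; s++) {
            if (chr != str.charAt(sublen * s + c)) {
                return false;
            }
        }
    }
    return true;
}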
function is_periodic(str)
{
    var len = str.length;
    if (len < 2) {
        return false;
    }
    if (substr_repeats(str, 1, len)) {
        return true;
    }
    var sqrt_len = Math.sqrt(len);
    for (var n = 2; n <= sqrt_len; n++) { // n: candidate divisor
        var m = len / n; // m: candidate complementary divisor
        if (Math.floor(m) == m) {
            // Try sublength m first, then sublength n (skipped when n == m).
            if (substr_repeats(str, m, n) || (n != m && substr_repeats(str, n, m))) {
                return true;
            }
        }
    }
    return false;
}
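To see which sublengths this first version tries, and in which order, the loop above can be replayed without the string comparisons. This is only a sketch: candidate_sublengths is a hypothetical helper that is not part of the code above, and it assumes every check fails, so that no early return cuts the list short.

```javascript
// Hypothetical helper: lists the sublengths that the first is_periodic()
// would test for a string of length len, in the order it tests them,
// assuming that none of the tests succeeds.
function candidate_sublengths(len)
{
    var result = [];
    if (len < 2) {
        return result;
    }
    result.push(1); // the initial substr_repeats(str, 1, len) call
    var sqrt_len = Math.sqrt(len);
    for (var n = 2; n <= sqrt_len; n++) {
        var m = len / n;
        if (Math.floor(m) == m) {
            result.push(m);     // sublength m repeated n times is tried first
            if (n != m) {
                result.push(n); // then sublength n repeated m times
            }
        }
    }
    return result;
}

console.log(candidate_sublengths(100).join(", ")); // 1, 50, 2, 25, 4, 20, 5, 10
console.log(candidate_sublengths(120).join(", ")); // 1, 60, 2, 40, 3, 30, 4, 24, 5, 20, 6, 15, 8, 12, 10
```

The sublength 100 (or 120) itself never appears, because a substring as long as the whole string would occur only once.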
Unfortunately there is no String method for comparing to a substring of another string in place (e.g., in C that would be strncmp(str1, str2 + offset, length)).
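For illustration, an in-place comparison in the spirit of strncmp can be emulated with a plain loop. This is a sketch only: region_equals and its parameter names are hypothetical, and no claim is made about how its speed compares with substr_repeats().

```javascript
// Hypothetical helper: true if the length characters of a starting at a_start
// equal the length characters of b starting at b_start. Characters are compared
// one by one, so no temporary substrings are created.
function region_equals(a, a_start, b, b_start, length)
{
    // A region that runs past the end of either string cannot match.
    if (a_start + length > a.length || b_start + length > b.length) {
        return false;
    }
    for (var i = 0; i < length; i++) {
        if (a.charCodeAt(a_start + i) != b.charCodeAt(b_start + i)) {
            return false;
        }
    }
    return true;
}

console.log(region_equals("abcabc", 0, "abcabc", 3, 3)); // true
console.log(region_equals("abcabd", 0, "abcabd", 3, 3)); // false
```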
Say your string has a length of 120 and consists of a substring of length 6 repeated 20 times. You can also look at it as consisting of a sublength (length of substring) of 12 repeated 10 times, sublength 24 repeated 5 times, sublength 30 repeated 4 times, or sublength 60 repeated 2 times (these sublengths are 6 multiplied by the different combinations of the prime factors of 20, which are 2*2*5). Conversely, if you check whether your string consists of a sublength of 60 repeated 2 times and the check fails, you can be sure that it does not consist of any repeated sublength that is a divisor (i.e., a combination of prime factors) of 60, including 6.
In other words, many of the checks made by the code above are redundant. E.g., for length 120 it checks (luckily failing quickly most of the time) the following sublengths: 1, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 24, 30, 40, 60 (in this order: 1, 60, 2, 40, 3, 30, 4, 24, 5, 20, 6, 15, 8, 12, 10). Of these, only 24, 40 and 60 are necessary. These are 2*2*2*3, 2*2*2*5 and 2*2*3*5, i.e., the prime factorization of 120 (2*2*2*3*5) with one occurrence of each distinct prime taken out, or, if you prefer, 120/5, 120/3 and 120/2.
So, forgetting for a moment that efficient prime factorization is not a simple task, we can restrict our checks to p repetitions of a substring of sublength length/p, where p is a prime factor of length. The following is the simplest nontrivial implementation:
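// substr_repeats(str, sublen, subcount): see above
// Returns the distinct prime factors of n in ascending order.
// n must be at least 1: for n == 0 the first while loop would never terminate.
function distinct_primes(n)
{
    var primes = n % 2 ? [] : [2];
    while (n % 2 == 0) {
        n /= 2;
    }
    // n is odd from here on, so only odd candidates need to be tried.
    for (var p = 3; p * p <= n; p += 2) {
        if (n % p == 0) {
            primes.push(p);
            n /= p;
            while (n % p == 0) {
                n /= p;
            }
        }
    }
    // Whatever is left above 1 is a single prime larger than the square root.
    if (n > 1) {
        primes.push(n);
    }
    return primes;
}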
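function is_periodic(str)
{
    var len = str.length;
    if (len < 2) {
        // Too short to be periodic; this also avoids calling distinct_primes(0),
        // which would loop forever.
        return false;
    }
    var primes = distinct_primes(len);
    // Largest prime first, i.e., shortest candidate substring first.
    for (var i = primes.length - 1; i >= 0; i--) {
        var sublen = len / primes[i];
        if (substr_repeats(str, sublen, len / sublen)) {
            return true;
        }
    }
    return false;
}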
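As a quick check of the prime-factor argument, the sublengths that this version tests can be listed directly. This is a sketch: it repeats distinct_primes() from above so that it runs on its own, and necessary_sublengths is a hypothetical helper name.

```javascript
// Same function as in the listing above, repeated so the sketch is self-contained.
function distinct_primes(n)
{
    var primes = n % 2 ? [] : [2];
    while (n % 2 == 0) {
        n /= 2;
    }
    for (var p = 3; p * p <= n; p += 2) {
        if (n % p == 0) {
            primes.push(p);
            n /= p;
            while (n % p == 0) {
                n /= p;
            }
        }
    }
    if (n > 1) {
        primes.push(n);
    }
    return primes;
}

// Hypothetical helper: the sublengths len/p for every distinct prime p of len,
// in the order in which is_periodic() above walks them (largest prime first).
function necessary_sublengths(len)
{
    if (len < 2) {
        return []; // distinct_primes() must not be called with 0
    }
    var primes = distinct_primes(len);
    var result = [];
    for (var i = primes.length - 1; i >= 0; i--) {
        result.push(len / primes[i]);
    }
    return result;
}

console.log(necessary_sublengths(120).join(", ")); // 24, 40, 60
console.log(necessary_sublengths(100).join(", ")); // 20, 50
```

For length 120 this gives three checks instead of the fifteen made by the first version.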
Trying out this code on my Linux PC I had a surprise: on Firefox it was much faster than the first version, but on Chromium it was slower, becoming faster only for strings with lengths in the thousands. Eventually I found out that the problem was related to the array that distinct_primes() creates and returns to is_periodic(). The solution was to get rid of the array by merging these two functions. The code is below and the test results are on http://jsperf.com/periodic-strings-1/5
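// substr_repeats(str, sublen, subcount): see at top
function is_periodic(str)
{
    var len = str.length;
    if (len < 2) {
        // Without this guard the empty string would be reported as periodic,
        // because substr_repeats(str, 0, 2) has nothing to compare.
        return false;
    }
    var n = len; // part of len that is still to be factorized
    if (n % 2 == 0) {
        n /= 2;
        if (substr_repeats(str, n, 2)) {
            return true;
        }
        while (n % 2 == 0) {
            n /= 2;
        }
    }
    for (var p = 3; p * p <= n; p += 2) {
        if (n % p == 0) {
            // p is a prime factor of len: test sublength len / p.
            if (substr_repeats(str, len / p, p)) {
                return true;
            }
            n /= p;
            while (n % p == 0) {
                n /= p;
            }
        }
    }
    // Whatever is left above 1 is the last (largest) prime factor.
    if (n > 1) {
        if (substr_repeats(str, len / n, n)) {
            return true;
        }
    }
    return false;
}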
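A merged function like this is easy to get subtly wrong, so it may be worth cross-checking it against an independent test. The sketch below repeats the two functions so that it runs on its own and compares them, for every string of lengths 1 to 12 over the letters a and b, with a well-known trick that is not part of the code above: a non-empty string is periodic exactly when it occurs inside itself doubled at an index strictly between 0 and its length. The name is_periodic_oracle is hypothetical.

```javascript
function substr_repeats(str, sublen, subcount)
{
    for (var c = 0; c < sublen; c++) {
        var chr = str.charAt(c);
        for (var s = 1; s < subcount; s++) {
            if (chr != str.charAt(sublen * s + c)) {
                return false;
            }
        }
    }
    return true;
}

function is_periodic(str)
{
    var len = str.length;
    var n = len;
    if (n % 2 == 0) {
        n /= 2;
        if (substr_repeats(str, n, 2)) {
            return true;
        }
        while (n % 2 == 0) {
            n /= 2;
        }
    }
    for (var p = 3; p * p <= n; p += 2) {
        if (n % p == 0) {
            if (substr_repeats(str, len / p, p)) {
                return true;
            }
            n /= p;
            while (n % p == 0) {
                n /= p;
            }
        }
    }
    if (n > 1) {
        if (substr_repeats(str, len / n, n)) {
            return true;
        }
    }
    return false;
}

// Hypothetical oracle for non-empty strings: str is periodic exactly when
// it is found in str + str before index str.length (searching from index 1).
function is_periodic_oracle(str)
{
    return (str + str).indexOf(str, 1) < str.length;
}

// Build every a/b string of lengths 1..12 from the bits of a counter.
var mismatches = 0;
for (var len = 1; len <= 12; len++) {
    for (var bits = 0; bits < (1 << len); bits++) {
        var str = "";
        for (var i = 0; i < len; i++) {
            str += (bits >> i) & 1 ? "b" : "a";
        }
        if (is_periodic(str) != is_periodic_oracle(str)) {
            mismatches++;
        }
    }
}
console.log("mismatches: " + mismatches);
```

The oracle builds a string twice as long for every test, so it is meant for checking correctness only, not as a faster replacement.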
Please remember that the timings collected by jsperf.com are absolute, and that different experimenters with different machines will contribute to different combinations of channels. To compare two JavaScript engines reliably, you need to create your own private version of the experiment and run it in both engines on the same machine.