
I have a question about regular expressions in JavaScript.

I want to detect whether a string contains a substring made of repeated characters or words.

For example, the string "aaaaabcd" contains repeated substrings such as a or aa,

but the string "abcdefghij" does not contain any repeated substring.

I wrote a RegExp in JavaScript to detect this.

const written_contents = "aaaaaabcd"
const re = new RegExp("(\w+)\1{3,}", "g")
if (re.test(written_contents) ) {
    return "repetition detected."
}

My intention was to detect the same word or character repeated 3 or more times.

Let me explain the logic that led me to this RegExp.

If the string is "aaaaaabc",
\w+ will match any run of 1 or more word characters, such as a, aa, aaa, b, c, aaab, aabc, aaabc.

In (\w+)\1, the \1 refers back to the first parenthesized group. Here that group is (\w+).

And {3,} means \1 is repeated 3 or more times.

I added the "g" flag to search the whole string.

Now I expect "aaaaa" to be captured, because the first a is \w, the second a is \1, and the third to fifth a are {3,}, so "aaaaa" matches.
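The counting above can be checked by running the same pattern written as a regex literal and inspecting what it captures. This is only a minimal sketch: the extra test strings are made up for illustration. One thing it suggests is that (\w+)\1{3,} asks for the captured text followed by at least three more copies, so four in a row at minimum.

```javascript
// Same pattern as in the question, written as a regex literal.
const re = /(\w+)\1{3,}/;

// exec() returns the whole match at index 0 and group 1 at index 1.
const m = re.exec("aaaaaabc");
console.log(m[0]); // "aaaaaa" (one captured "a" plus five repeats)
console.log(m[1]); // "a"

// Group + {3,} repeats means at least four consecutive copies are needed.
console.log(re.test("aaaa"));     // true
console.log(re.test("aaa"));      // false
console.log(re.test("abababab")); // true  ("ab" four times)
console.log(re.test("ababab"));   // false ("ab" only three times)
```

To flag three copies in a row instead, the quantifier would be {2,}.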

But the code does not work.

What's wrong?

Jeehoon Park
  • You have to double-escape the backslashes in the RegExp constructor: `(\\w+)\\1{3,}`. Alternatively, you can write it as `const re = /(\w+)\1{3,}/g;` – The fourth bird Mar 13 '21 at 10:36
  • I don't fully understand why you use `\w+` if you just want one character to repeat rather than a sequence, but other than that: [why-do-regex-constructors-need-to-be-double-escaped](https://stackoverflow.com/questions/17863066/why-do-regex-constructors-need-to-be-double-escaped) – ASDFGerte Mar 13 '21 at 11:00
  • Thanks, 'The fourth bird'. I decided not to use the RegExp constructor; the / / literal is more straightforward. – Jeehoon Park Mar 13 '21 at 11:19
  • Thanks, ASDFGerte. Your comment is right. I changed it to (.+) – Jeehoon Park Mar 13 '21 at 11:20
  • Thanks pilchard. I applied your advice and it succeeded. – Jeehoon Park Mar 14 '21 at 00:40

1 Answer


I found the problem, though I still don't know why my previous code was wrong in terms of JavaScript syntax.

const re = /(.+)\1{1,}/

if (re.test(written_contents) ) {
    return "repetition detected"
}

The above code works.

Strangely,

re = new RegExp('(.+)\1{1,}')  

did not work.
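A likely explanation, sketched below, is that the backslashes are consumed by the string literal before the RegExp constructor ever sees the pattern. In a string, \w is not a recognized escape, so it becomes a plain w, and \1 is a legacy octal escape for the character U+0001 (in strict mode it is a SyntaxError instead). The variable names here are made up for illustration, and the broken pattern is spelled with \u0001 so the snippet also runs in strict mode.

```javascript
// "\w" is not a string escape, so the backslash is silently dropped.
const unescaped = "(\w+)";
console.log(unescaped === "(w+)"); // true

// What "(\w+)\1{3,}" amounts to in sloppy mode:
// literal "w" characters followed by three or more U+0001 characters.
const broken = new RegExp("(w+)\u0001{3,}");

// Doubling the backslashes keeps them in the string, so the regex engine
// receives \w and \1 as intended. The literal form needs no doubling.
const fixedCtor = new RegExp("(\\w+)\\1{3,}");
const fixedLiteral = /(\w+)\1{3,}/;

const writtenContents = "aaaaaabcd";
console.log(broken.test(writtenContents));             // false
console.log(fixedCtor.source === fixedLiteral.source); // true
console.log(fixedCtor.test(writtenContents));          // true
console.log(fixedLiteral.test(writtenContents));       // true
```

The "g" flag is left out on purpose: with "g", test() updates lastIndex between calls, which is not needed for a simple yes/no check.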

Jeehoon Park
  • You should escape the \ symbol when using the RegExp constructor. Try `re = new RegExp('(.+)\\1{1,}')` – JongHyeon Yeo Mar 13 '21 at 13:40
  • @jong-hyeon-yeo Thanks a lot. I find / xxx / more convenient than the RegExp() constructor because I don't have to escape the \ symbol, as you pointed out. Now I understand why my previous code did not work. – Jeehoon Park Mar 14 '21 at 00:43