
I am making a Chrome extension that is given a list of domains to compare against the active tab's URL. For example, if the list contains "google", the extension should detect "docs.google.com" as being on the domain list. I have gotten this part to work. The issue is when the domain list contains a subdomain. For example, if "docs.google" is on the list and the user is on "google.com", the extension should not treat that URL as being on the domain list.

I am attempting this by constructing a regular expression for each domain and subdomain. As I said, it works properly when the list entry is a domain (as opposed to a subdomain), but when I test it with subdomains it does not seem to work. I assume the issue is with how I constructed the RegEx. Does anything stand out? Thank you in advance!

let onDomainList = false;
for(let i = 0; i < domainListLength-1; i++){
    if(!domainList[i].includes(".")){ //if this domain is not a subdomain
        let strPattern = "^https://www\\." + list.domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') + "|https://[a-z_]+\\." + list.domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
        let domainRegEx = new RegExp(strPattern,'i');
        if(domainRegEx.test(activeTab.url)){
            onDomainList = true;
            execute_script(activeTab);
        }
    } else{ //if this domain is a subdomain
        let strPattern = "^https://www\\." + list.domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
        let domainRegEx = new RegExp(strPattern,'i');
        if(domainRegEx.test(activeTab.url)){
            onDomainList = true;
            execute_script(activeTab);
        }
    }
}

EDIT: Changed the RegEx to what Wiktor Stribiżew suggested, although the issue of not detecting subdomains remains.

Jonnes
  • You have a `/` at the start, before `^`. You need to remove this `/`. Also, you need to double escape the `.`s since you are using a constructor notation. Also, `domainList[i]` must be escaped. Try `let strPattern = "^https://www\\." + domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') + "|https://[a-z_]+\\." + domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');`. And then also `let strPattern = "^https://www\\." + domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');` – Wiktor Stribiżew Jul 22 '22 at 08:31
  • Seems that the same problem exists of not detecting subdomains. Although what you said does work for normal domains. – Jonnes Jul 22 '22 at 08:58
  • Maybe because you do not match anything but `www` and `list.domainList[i]`? Try matching anything but `/`: `let strPattern = "^https://(?:[^\\s/]*\\.)?" + list.domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');`. It is hard to help without one or two test cases. – Wiktor Stribiżew Jul 22 '22 at 09:02
  • I'm sorry, I'm not sure I understand your explanation or the provided RegEx. Regardless, this did work! I thought I understood RegEx decently, but the line you provided doesn't look like anything to me. Either way, I really appreciate you answering this, thank you. – Jonnes Jul 22 '22 at 09:06
  • I will add the answer with explanations – Wiktor Stribiżew Jul 22 '22 at 09:09

1 Answer


Here is a fixed snippet:

let onDomainList = false;
for (let i = 0; i < domainListLength - 1; i++) {
  if (!domainList[i].includes(".")) { //if this domain is not a subdomain
    let strPattern = "^https://www\\." + domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') + "|https://[a-z_]+\\." + domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
    let domainRegEx = new RegExp(strPattern, 'i');
    if (domainRegEx.test(activeTab.url)) {
      onDomainList = true;
      execute_script(activeTab);
    }
  } else { //if this domain is a subdomain
    let strPattern = "^https://(?:[^\\s/]*\\.)?" + list.domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
    let domainRegEx = new RegExp(strPattern, 'i');
    if (domainRegEx.test(activeTab.url)) {
      onDomainList = true;
      execute_script(activeTab);
    }
  }
}

Notes:

  • Since you are using the RegExp constructor and define the regex with a regular string literal, you need to double the backslashes used to escape special chars. Here, there is no need to escape /, and the . needs two backslashes: the "\\." string literal is actually the text \.
  • The variable text needs escaping to be matched literally in the pattern, hence domainList[i].replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')
  • The / before ^ renders the regex useless since there can be no / before the start of string, and thus /^ is a regex that never matches any string. / as regex delimiters should not be used in RegExp constructor notation
  • Your subdomain regex did not actually match anything but https://www. + the domain from your list. To allow anything before the domain, you can replace www\. with (?:[^\s/]*\.)? that matches an optional sequence ((?:...)? is an optional non-capturing group) of zero or more chars other than whitespace and / (with the [^\s/]* negated character class) and then a dot.
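
To see how the notes above fit together, here is a minimal sketch that can be run outside the extension. The helper names escapeRegExp and buildDomainRegExp and the sample URLs are assumptions made for illustration; they are not part of the original code.

```javascript
// Hypothetical helper: escape regex metacharacters so a list entry
// such as "docs.google" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Hypothetical helper: build the pattern from the last note, i.e. an
// optional "anything up to a dot" prefix followed by the escaped entry.
// Backslashes are doubled because the pattern is a string literal.
function buildDomainRegExp(entry) {
  return new RegExp("^https://(?:[^\\s/]*\\.)?" + escapeRegExp(entry), 'i');
}

const docsGoogle = buildDomainRegExp("docs.google");

// Sample URLs (assumed test cases, not taken from the question).
const results = {
  escaped: escapeRegExp("docs.google"),
  onDocs: docsGoogle.test("https://docs.google.com/document"),
  onWwwDocs: docsGoogle.test("https://www.docs.google.com/"),
  onGoogle: docsGoogle.test("https://google.com/"),
  onWwwGoogle: docsGoogle.test("https://www.google.com/"),
};

console.log(results);
```

With these sample inputs, the "docs.google" entry should match the two docs URLs and reject the plain google.com ones. Note that the pattern does not anchor what follows the entry, so a host such as docs.googlexyz.com would also match; if that matters, a stricter suffix would need to be added.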
Wiktor Stribiżew
  • So the .replace(....) is used as an escape method: when there is a special character in the string, it makes the regex ignore its special meaning? Everything else you said makes sense and I appreciate the work you put in. – Jonnes Jul 22 '22 at 17:37
  • @Jonnes Right, if there are no special chars, no replacement will be done. If the list items only contain letters, digits or underscores, remove this. – Wiktor Stribiżew Jul 22 '22 at 19:16
  • got it. I can see how that is crucial to have. Thank you again. – Jonnes Jul 22 '22 at 22:06