0

I need to be able to check a URL to see if it contains a second level domain (SLD) for a valid streaming service. Note, the "hulu" in www.hulu.com is what I mean by an SLD.

Rather than parsing the URL with regex to get just the SLD, or using something like location.hostname.split('.').pop() to get the SLD, I thought I could use indexOf instead. And this works great (for me at least, I realize it's got at least one serious limitation - see my note below).

Example. Let's say I want to see if https://www.hulu.com/watch/... is a Hulu link. This works:

let string = 'https://www.hulu.com/watch/...';
string.indexOf('hulu') > -1; // evaluates to true

What I want to be able to do is pass an array of possible strings into indexOf. Something like this:

let validSLDs = ['hulu','netflix']

let string1 = 'www.hulu.com/watch/...';
let string2 = 'http://www.netflix.com/watch/...';
let string3 = 'imdb.com/title/....'

string1.indexOf(validSLDs); // returns true ('hulu' is a valid SLD)
string2.indexOf(validSLDs); // returns true ('netflix' is a valid SLD)
string3.indexOf(validSLDs); // returns false ('imdb' is not a valid SLD)

But of course this doesn't work because indexOf() is expecting to be passed a string, not an array of strings.

So is there some similarly easy, elegant (vanilla JS) solution that I'm not thinking of?

The next easiest thing I could think of would be to loop through my array of validSLDs and call indexOf on each of the SLDs. And maybe that is the best approach. I just thought I'd see if anyone else had a better solution. Thanks in advance!

NOTE: I realize that my entire approach is a lazy one and could cause issues. For example, https://www.amazon.com/how-cancel-hulu-subscription-membership/... would also return true using the code above, because the word "hulu" exists in the string ... but isn't an SLD. I'm ok with that because we have some control over the URLs we need to validate.

Tyler Youngblood
  • 1
    Is using [URL API](https://developer.mozilla.org/en-US/docs/Web/API/URL_API) *plain* enough? – PM 77-1 Feb 22 '21 at 19:56
  • 2
    You need to do it the other way around: for each ([`forEach`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/forEach)), or even better for [`every`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/every), or maybe just for [`some`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/some) string item in your array of `validSLDs`, you have to check whether it is part of the string to be validated. – Peter Seliger Feb 22 '21 at 20:02
  • You are hopefully also aware that this approach is not bulletproof, since e.g. `hulu` could, for whatever reason, be part of the pathname of a totally different domain. – Peter Seliger Feb 22 '21 at 20:08
  • @PM77-1 the URL API won't give me the Second Level Domain though, correct? We already have a list of valid SLDs (hoopla, kanopy, etc) but they aren't in the www.hoopla.com format. They are literally just the SLD without the www and the .com. So I was trying to avoid needing to strip that out before comparing. But maybe I'm missing something in the URL API? – Tyler Youngblood Feb 22 '21 at 21:59
  • @PeterSeliger yea, I definitely thought about the fact that this approach isn't bullet proof. But the truth is I'm dealing with a few very rare outlier URLs. 99.9% of the URLs we are dealing with at work are correct. There has so far only been a single URL discovered (out of the hundreds of thousands that we deal with) that links to IMDB instead of a valid streaming service. And no, it's not because it's an IMDB-TV link. I checked that. I could have easily blacklisted IMDB but instead thought it would be safer to whitelist our known valid SLDs. – Tyler Youngblood Feb 22 '21 at 22:04
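Regarding the URL API question in the comments above: the API does not expose the SLD directly, but it does expose the hostname, which can then be split into labels. The following is only a minimal sketch under stated assumptions, not something proposed in the thread. `hasValidSLD` is a hypothetical helper, it assumes a missing scheme can be replaced with `https://`, and it naively treats the second-to-last hostname label as the SLD, which would be wrong for multi-part suffixes such as `.co.uk`.

```javascript
// Hypothetical helper, not part of the original thread.
const validSLDs = new Set(['hulu', 'netflix']);

function hasValidSLD(link) {
  // The URL constructor needs a scheme, so prepend one if it is missing (assumption).
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(link);
  const withScheme = hasScheme ? link : `https://${link}`;

  let hostname;
  try {
    hostname = new URL(withScheme).hostname;
  } catch {
    return false; // the string could not be parsed as a URL
  }

  // Naive assumption: the label just before the last one is the SLD.
  const labels = hostname.split('.');
  const sld = labels[labels.length - 2];
  return validSLDs.has(sld);
}

console.log(hasValidSLD('www.hulu.com/watch/...'));
console.log(hasValidSLD('imdb.com/title/....'));
console.log(hasValidSLD('https://www.amazon.com/how-cancel-hulu-subscription-membership/...'));
```

Because only the hostname is inspected, a word such as "hulu" appearing in the path no longer produces a match.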

4 Answers

2

Just make a little helper function that does what you said, loops through the array and checks each value. An efficient way to do that is with Array.some, as it will return true as soon as it finds a truthy match.

let validSLDs = ['hulu','netflix']

let string1 = 'www.hulu.com/watch/...';
let string2 = 'http://www.netflix.com/watch/...';
let string3 = 'imdb.com/title/....'

const isURLOK = (testString) => validSLDs.some(v => testString.indexOf(v) > -1);

console.log(isURLOK(string1));
console.log(isURLOK(string2));
console.log(isURLOK(string3));
James
  • 1
    ... why not make use of [`String.prototype.includes`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/includes) instead of `indexOf`? – Peter Seliger Feb 22 '21 at 20:11
  • @PeterSeliger Sure, you could do. I know that OP is familiar with indexOf. – James Feb 22 '21 at 20:13
  • 1
    @PeterSeliger that's an excellent point! I don't actually need the index at all. includes() is a much better approach in this situation. It's been a long day. Should have thought of that. Thx! – Tyler Youngblood Feb 22 '21 at 22:17
  • @James I do appreciate you sticking with indexOf() for your answer. I happen to be familiar with includes, but if I hadn't been I would have had two functions to research. As it stands, I only have to brush up on array.some :) – Tyler Youngblood Feb 22 '21 at 22:24
0
let string1 = 'www.hulu.com/watch/...';
let string2 = 'http://www.netflix.com/watch/...';
let string3 = 'imdb.com/title/....';
let unverifiedSLDs = [string1, string2, string3];
let validSLDs = ['hulu', 'netflix', 'imdb'];

// true only if every URL contains at least one of the valid SLDs
let allSLDsAreValid = unverifiedSLDs.every(s => validSLDs.some(sld => s.includes(sld)));

allSLDsAreValid is a boolean: it evaluates to true if all strings are valid, and to false if there is at least one invalid string.

If instead you want to keep track of which SLDs are valid and which are not, try using an object:

let validatedStrings = {};
unverifiedSLDs.forEach(s => {
    // record only the URLs that contain one of the valid SLDs
    if (validSLDs.some(sld => s.includes(sld))) {
        validatedStrings[s] = true;
    }
});

Then when you need to access the SLD's validity, you can check for the key existence in validatedStrings:

validatedStrings[string1]     // true
validatedStrings[string2]     // true
validatedStrings[string3]     // true
validatedStrings['not-valid'] // undefined
Katie McCulloch
0

I really appreciate all the suggestions. Ultimately I went with a combination of suggestions and did the following.

let link = 'www.imdb.com/title/...';

const validateLink = (link) => {
  // the full list is elided here
  let validSLDs = ['hoopla', 'kanopy' /* ... */];

  for (let i in validSLDs) {
    if (link.includes(validSLDs[i])) {
      return true;
    }
  }
  return false;
};

Which seems to work just fine (although it isn't bulletproof, as mentioned in my original NOTE and in a comment). Passing the IMDB link into the function returned false, while passing a hoopla or kanopy link returned true. I might try and refactor a bit more ... but this should work for my purposes.

Then Peter Seliger wrote an even more succinct version (below) that works even better. Especially if you're not confused by nested arrow functions! Click on the "Run Code Snippet" button below to see Peter's improved answer in action.

Thanks again for the help everyone!

Edit by Peter Seliger in order to clarify some code behavior

Q: OP in comment ...

"... But I ran into some issues. So then I tried copy/pasting your solution into codepen and I'm getting a TypeError: cannot read property 'includes' of undefined. ..."

A:

If I copy and paste it into e.g. the browser console and invoke validateLink('https://www.hulu.com/watch/...'), it returns false ... invoking validateLink('https://www.hoopla.com/watch/...') returns true.

And even if ['hoopla', 'kanopy', /*...*/] inside the function were a sparse array (an array with empty slots, which it is not), the iteration would still work, because iterating array methods such as some, every and forEach skip empty slots.

Executable code snippet demonstrating the above:

const validateLink = link =>
  ['hoopla', 'kanopy', /*...*/].some(sld => link.includes(sld));

console.log(
  "validateLink('https://www.hulu.com/watch/...') ?",
  validateLink('https://www.hulu.com/watch/...') // false
);
console.log(
  "validateLink('https://www.hoopla.com/watch/...') ?",
  validateLink('https://www.hoopla.com/watch/...') // true
);
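The remark about sparse arrays can be checked with a small experiment. This is only an illustrative sketch: `sparseSLDs` and the kanopy URL are made up for the demonstration and do not come from the code discussed here.

```javascript
// Hypothetical sparse list: the empty slot between the two names is deliberate.
const sparseSLDs = ['hoopla', , 'kanopy'];

// Illustrative URL, assumed for this example only.
const link = 'https://www.kanopy.com/video/...';

let calls = 0;
const matched = sparseSLDs.some(sld => {
  calls += 1; // counts how often the callback actually runs
  return link.includes(sld);
});

console.log(matched); // true
console.log(calls);   // 2, because the empty slot at index 1 is never visited
```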
Tyler Youngblood
  • 1
    `const validateLink = link => ['hoopla', 'kanopy', /*...*/].some(sdl => link.includes(sdl));` does exactly the same as your approach above, just shorter and more expressively. Under the hood, `some` is implemented similarly to what you came up with. Both `some` and `every` return a boolean value, and both exit their inner loops immediately, as soon as the condition is either met (`some`) or violated (`every`). – Peter Seliger Feb 22 '21 at 23:15
  • @PeterSeliger I appreciate you returning to this question multiple times to help! I wanted to try and rewrite your function without using nested arrow functions so I could more easily wrap my head around what's happening. But I ran into some issues. So then I tried copy/pasting your solution into codepen and I'm getting a `TypeError: cannot read property 'includes' of undefined`. But `link` is defined a few lines above in my code. Is there something else I'm missing? – Tyler Youngblood Feb 24 '21 at 22:35
  • If I *copy and paste* it into e.g. the browser console right now and invoke `validateLink('https://www.hulu.com/watch/...')`, it returns `false` ... invoking `validateLink('https://www.hoopla.com/watch/...')` returns `true`. And even if `['hoopla', 'kanopy', /*...*/]` inside the function were a [sparse array](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/forEach#sparsearray) (an array with `empty` slots, which it is not), the iteration would still work, because iterating array methods such as `some`, `every` and `forEach` skip `empty` slots. Thus I cannot even guess what went wrong in your codepen. – Peter Seliger Feb 25 '21 at 08:29
  • Ah! It was the nested arrow function that confused me. I was trying to execute validateLink() without passing the link in as a parameter, which is obviously wrong. This works perfectly. Thanks @PeterSeliger! – Tyler Youngblood Feb 25 '21 at 15:29
  • Consider using `for (let i of validSLDs)` instead. See https://stackoverflow.com/questions/500504/why-is-using-for-in-for-array-iteration-a-bad-idea for more information. – Heretic Monkey Feb 25 '21 at 15:44
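For readers who, like the OP, prefer to avoid nested arrow functions, the `for...of` suggestion in the last comment could look roughly like the sketch below. The name `validateLinkForOf` is hypothetical, and the two-entry list is only a stand-in because the full list is elided in the answer.

```javascript
// Same check as in the answer, written with for...of instead of for...in.
// validateLinkForOf is a hypothetical name; the list is a stand-in for the elided one.
const validateLinkForOf = (link) => {
  const validSLDs = ['hoopla', 'kanopy'];

  // for...of yields the array values directly, so no index lookup is needed.
  for (const sld of validSLDs) {
    if (link.includes(sld)) {
      return true; // stop at the first match, just like Array.prototype.some
    }
  }
  return false;
};

console.log(validateLinkForOf('www.imdb.com/title/...'));
console.log(validateLinkForOf('https://www.hoopla.com/watch/...'));
```

Unlike `for...in`, which iterates over enumerable property keys as strings, `for...of` walks the array's values in order.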
-1

You could try using the filter function of arrays:

let validSLDs = ['hulu','netflix']

let string1 = 'www.hulu.com/watch/...';
let string2 = 'http://www.netflix.com/watch/...';
let string3 = 'imdb.com/title/....'

validSLDs.filter(sld => string1.indexOf(sld) !== -1).length > 0; // returns true ('hulu' is a valid SLD)
validSLDs.filter(sld => string2.indexOf(sld) !== -1).length > 0; // returns true ('netflix' is a valid SLD)
validSLDs.filter(sld => string3.indexOf(sld) !== -1).length > 0; // returns false ('imdb' is not a valid SLD)
fmilani
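One practical difference between the `filter` approach and the `some` approach from the accepted-style answers is that `filter` always visits every element, while `some` stops at the first match. The sketch below counts callback invocations to make that visible; the counters and the `link` variable are introduced here purely for illustration.

```javascript
const validSLDs = ['hulu', 'netflix'];
const link = 'www.hulu.com/watch/...';

// filter builds a new array and always runs the callback for every element.
let filterCalls = 0;
const viaFilter = validSLDs.filter(sld => {
  filterCalls += 1;
  return link.includes(sld);
}).length > 0;

// some returns a boolean and stops as soon as the callback returns a truthy value.
let someCalls = 0;
const viaSome = validSLDs.some(sld => {
  someCalls += 1;
  return link.includes(sld);
});

console.log(viaFilter, filterCalls); // true 2
console.log(viaSome, someCalls);     // true 1
```

For a list of two SLDs the difference is negligible, but `some` also states the intent ("does any entry match?") more directly.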