Here I have a scala UDF that checks if a url is one of my domains. To check if 'to_site' is one of my domains, I'm using indexOf
in javascript.
CREATE TEMPORARY FUNCTION our_domain(to_site STRING)
RETURNS BOOLEAN
LANGUAGE js AS """
domains = ['abc.com', 'xyz.com'];
if (to_site == null || to_site == undefined) return false;
for (var i = 0; i < domains.length; i++){
var q= DOMAIN('XYZ');
if (String.prototype.toLowerCase.call(to_site).indexOf(domains[i]) !== -1)
return true;
}
return false;
""";
SELECT our_domain('www.foobar.com'), our_domain('www.xyz.com');
This returns false, then true.
It would be much nicer if I could use the DOMAIN(url) function from javascript. indexOf
is not very good because it will match www.example.com?from=www.abc.com, when actually example.com is not one of my domains. Javascript also has a (new URL('www.example.com/q/z')).hostname to parse the domain component, but it includes the subdomain like 'www.' which complicates the comparison. Bigquery's DOMAIN(url) function only gives the domain and knowing google it's fast C++.
I know I can do this
our_domain(DOMAIN('www.xyz.com'))
But in general it would be nice to use some of the bigquery API functions in javascript. Is this possible?
I also tried this
CREATE TEMPORARY FUNCTION our_domain1(to_site String)
AS (our_domain(DOMAIN(to_site));
but it fails saying DOMAIN does not exist.