If you have a specific query-parameter key in mind, its pretty easy to draft a regex to capture it. For gclid
for example, the regex would be [\?&;]gclid=([^&|;]+)
, where the capture group contains the value.
Given that, we can use redshift's REGEXP_REPLACE
method to extract the value from the url:
select REGEXP_REPLACE(url, '.*[\?&;]gclid=([^&|;]+).*', '$1')
Note: we use REGEXP_REPLACE
instead of REGEXP_SUBSTR
in order to return the capture group's value, instead of the whole match
Finally, we can generalize this solution to work for any given query-parameter key with the following function
/**
* extracts the value of a query parameter from a url, using regex
*/
CREATE OR REPLACE FUNCTION get_queryparam_from_url(url varchar, key varchar)
RETURNS varchar
IMMUTABLE
AS $$
select
CASE WHEN (REGEXP_REPLACE($1, '.*[\?&;]' || $2 || '=([^&|;]+).*', '$1') = $1)
THEN null -- if nothing was replaced, no match
ELSE REGEXP_REPLACE($1, '.*[\?&;]' || $2 || '=([^&|;]+).*', '$1') -- otherwise, this will be the capture group's value
END
$$ LANGUAGE sql;
usage
select get_queryparam_from_url(route, 'gclid')