I'm struggling to figure out the best way to strip out all the content in a URL from a specific keyword onwards (including the keyword), using either a regex or a substring operation. For example, given the dynamic URL http://example.com/category/subcat/filter/size/1/, I would like to strip out the /filter/size/1 part and be left with the remaining URL as a separate string. Grateful for any pointers.

I should clarify that the number of arguments after the `filter` keyword isn't fixed and could be more than in my example, and the number of category arguments before the `filter` keyword isn't fixed either.

`'http://example.com/category/subcat/filter/size/1/'.replace(/^.*filter\/size\/1/, '')` try [regex101.com](https://regex101.com). – RobG Oct 11 '20 at 14:14
5 Answers
Use the `split()` function.
const url = 'http://example.com/category/subcat/filter/size/1/';
console.log(url.split('/filter')[0]);
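Since the question also mentions a substring operation, here is a minimal sketch of that route using `indexOf` and `slice`. The helper name `stripFromKeyword` and its `keyword` parameter are hypothetical, made up for illustration. It cuts at the first `/filter` it finds, so the number of path levels before or after the keyword does not matter, and it returns the URL unchanged when the keyword is missing.

```javascript
// Hypothetical helper: returns everything before the first "/<keyword>" in the URL.
function stripFromKeyword(url, keyword = 'filter') {
  const marker = `/${keyword}`;
  const index = url.indexOf(marker);
  // indexOf returns -1 when the marker is absent; leave the URL untouched then.
  return index === -1 ? url : url.slice(0, index);
}

console.log(stripFromKeyword('http://example.com/category/subcat/filter/size/1/'));
// http://example.com/category/subcat
console.log(stripFromKeyword('http://example.com/category/subcat/subsubcat/filter/size/1/color/black'));
// http://example.com/category/subcat/subsubcat
```

One limitation of this sketch: it is a plain substring search, so a path segment such as `/filters` would also be treated as the keyword.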

Thanks for the answer - that won't catch variable nested levels of subcategories - e.g http://example.com/category/subcat/subsubcat/filter/size/1/color/black - sorry i should have clarified that in the OP – bsod99 Oct 11 '20 at 14:29
To be a little safer, you could use the `URL` object to handle most of the parsing and then just sanitize the `pathname`.
const filteredUrl = 'http://example.com/category/subcat/filter/test?param1&param2=test';
console.log(unfilterUrl(filteredUrl));
function unfilterUrl(urlString) {
  const url = new URL(urlString);
  url.pathname = url.pathname.replace(/(?<=\/)filter(\/|$).*/i, '');
  return url.toString();
}
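A variation on the same idea, for cases where a lookbehind regex is not wanted, is to split the `pathname` into segments and cut at the first segment that equals the keyword. This is only a sketch: the function name `stripFilterSegments` and the `keyword` parameter are hypothetical, and unlike the regex above the comparison here is case-sensitive.

```javascript
// Hypothetical helper: drops the "filter" path segment and everything after it,
// while leaving the query string and the rest of the URL to the URL object.
function stripFilterSegments(urlString, keyword = 'filter') {
  const url = new URL(urlString);
  const segments = url.pathname.split('/'); // leading "/" yields an empty first element
  const index = segments.indexOf(keyword);  // exact segment match, so "filters" is ignored
  if (index !== -1) {
    // Keep the segments before the keyword and restore the trailing slash.
    url.pathname = segments.slice(0, index).join('/') + '/';
  }
  return url.toString();
}

console.log(stripFilterSegments('http://example.com/category/subcat/filter/size/1/?page=2'));
// http://example.com/category/subcat/?page=2
```

Matching whole segments avoids accidentally cutting at a category that merely starts with the keyword.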

You can tweak this a little based on your needs; for example, `filter` might not be present in the URL at all. Assuming it is present, consider the following regular expression:
/(.*)\/filter\/(.*)/g
The first captured group (available as `$1`) is the portion of the string before the `filter` keyword, and the second captured group (available as `$2`) contains all the filters that follow the `filter` keyword.
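As a rough illustration of how those two groups can be read in JavaScript, here is a small sketch. The function name `splitAtFilter` is hypothetical. The `g` flag is dropped here on purpose, because `String.prototype.match` with a global regex returns only the full matches and not the capture groups. Note also that `(.*)` is greedy, so if `/filter/` appeared more than once the split would happen at the last occurrence.

```javascript
// Same pattern as above, without the g flag so that match() exposes the groups.
const filterPattern = /(.*)\/filter\/(.*)/;

// Hypothetical helper: returns both parts, or null when "/filter/" is absent.
function splitAtFilter(url) {
  const match = url.match(filterPattern);
  if (!match) return null;
  const [, base, filters] = match; // match[1] is $1, match[2] is $2
  return { base, filters };
}

const example = 'http://example.com/category/subcat/subsubcat/filter/size/1/color/black';
console.log(splitAtFilter(example));
// { base: 'http://example.com/category/subcat/subsubcat', filters: 'size/1/color/black' }

// The same groups are available as $1 and $2 in a replacement string.
console.log(example.replace(filterPattern, '$1'));
// http://example.com/category/subcat/subsubcat
```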

Split
The simplest solution that occurs to me is the following:
const url = 'http://example.com/category/subcat/filter/size/1/';
const [base, filter] = url.split('/filter/');
// where:
// base == 'http://example.com/category/subcat'
// filter == 'size/1/'
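If you expect more than one occurrence of '/filter/', use the limit parameter of String.split(): url.split('/filter/', 2);
Note that the limit only truncates the resulting array, so anything after a second '/filter/' is dropped rather than kept in the second element.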
RegExp
The assumption above is that everything after the `filter` keyword is part of the filter. If you need more granularity, you can use a regex that stops at the '?', for example. The following removes 'filter/anything/that/follows' where it immediately follows a / and continues up to, but not including, the first query string separator ?.
const filterRegex = /(?<=\/)filter(\/|$)[^?]*/i;
function parseURL(url) {
  const match = url.match(filterRegex);
  if (!match) { return [url, null, null]; } // no filter segment in the URL
  const stripped = url.replace(filterRegex, '');
  return [url, stripped, match[0]];
}
const [full, stripped, filter] = parseURL('http://example.com/category/subcat/filter/size/1/?query=string');
// where:
// stripped == 'http://example.com/category/subcat/?query=string'
// filter == 'filter/size/1/'

I'm sadly not able to post the full answer here, as it's telling me 'it looks like spam'. I created a gist with the original answer. In it I talk about the details of `String.prototype.match` and of JS/ES regex in general, including named capture groups and pitfalls, and include a link to a great regex tool: regex101. I'm not posting the link here for fear of triggering the filter again. But back to the topic:
In short, a simple regex can be used to split and format it (using `filter` as the keyword):
/^(.*)(\/filter\/.*)$/
or with named groups:
/^(?<main>.*)(?<stripped>\/filter\/.*)$/
(note that the forward slashes need to be escaped in a regex literal)
Using `String.prototype.match` with that regex will return an array of the matches: index 1 will be the first capture group (so everything before the keyword), index 2 will be everything after that (including the keyword).
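To make the named-group variant concrete, here is a minimal sketch. The wrapper function `splitUrl` is hypothetical; the regex itself is the one shown above. With named groups, the parts can be read from `match.groups` instead of by index.

```javascript
// Named-group version of the regex from this answer.
const namedPattern = /^(?<main>.*)(?<stripped>\/filter\/.*)$/;

// Hypothetical helper: returns the two named parts, or null when there is no match.
function splitUrl(url) {
  const match = url.match(namedPattern);
  if (!match) return null;
  const { main, stripped } = match.groups;
  return { main, stripped };
}

console.log(splitUrl('http://example.com/category/subcat/filter/size/1/'));
// { main: 'http://example.com/category/subcat', stripped: '/filter/size/1/' }
```

Because the pattern requires a slash after the keyword, a URL that ends in `/filter` with nothing after it will not match, and this sketch returns `null` for it.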
Again, all the details can be found in the gist.
