Best way to split URL into array in clean way?

Question

I need to extract the directory and file name from URLs that users enter in different formats.

Some examples would include:

https://foo/s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg
http://192.168.12.44:8090/TOP_PROD_IMAGE/R3CRDT-HZWT_IMRO_1.jpg
www.foobar-images.s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg

What I really need are the directory (TOP_PROD_IMAGE) and the file name (WS-25612-BK_IMRO_1.jpg).

I need to account for users who enter http://, https://, or just www. I tried string.split('/'), but that doesn't work in all cases. Is there something that would give me an array despite the double slash (//) when the user enters http? Thanks!

I'd use path-to-regexp for this. it's used within Express internally, and can be quite robust. https://www.npmjs.com/package/path-to-regexp If this really is just a one-off use case though, you could do it directly with regex. — Brad, Dec 23 '19 at 23:08
This question reminded me of a similar one [here](https://stackoverflow.com/a/45075028/4003419) But this one should be a lot easier. — LukStorms, Dec 23 '19 at 23:11

score 6 · Accepted Answer · answered Dec 23 '19 at 23:08

Consider:

const [file, folder] = url.split('/').reverse();

With this, you don't need to handle http:// or any other // specially.
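As a rough sketch of how that one-liner behaves on the three sample URLs from the question, it can be wrapped in a small function. The name `lastTwoSegments` is made up for illustration. Note the assumption that the input ends with the file name: a trailing slash would produce an empty file name, and any query string would stay attached to it.

```javascript
// Hypothetical helper: returns the last two "/"-separated segments of a URL-like string.
function lastTwoSegments(url) {
  // After reversing, the file name comes first and its parent folder second.
  const [file, folder] = url.split('/').reverse();
  return { folder, file };
}

const samples = [
  'https://foo/s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg',
  'http://192.168.12.44:8090/TOP_PROD_IMAGE/R3CRDT-HZWT_IMRO_1.jpg',
  'www.foobar-images.s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg',
];

for (const sample of samples) {
  console.log(lastTwoSegments(sample));
}
```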

C.OG


dvlden · Answer 2 · score 4 · 2019-12-23T23:12:14.513

How about:

const url = new URL('https://foo/s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg')
const urlParams = url.pathname.split('/') // you'll get array here, so inspect it and get last two items

Will this do the trick? You'll get exactly what you need within the pathname.
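One caveat: `new URL(...)` throws on input without a scheme, such as the `www.` example from the question. A minimal sketch of one way around that follows, assuming it is acceptable to treat scheme-less input as `https://`. The helper name `folderAndFile` and the fallback scheme are illustrative choices, not part of the original answer.

```javascript
// Hypothetical helper: parse user input with the WHATWG URL class,
// assuming "https://" when the user typed no scheme (e.g. "www.example.com/...").
function folderAndFile(input) {
  const hasScheme = /^[a-z][a-z\d+.-]*:\/\//i.test(input);
  const url = new URL(hasScheme ? input : `https://${input}`);

  // pathname starts with "/", so drop empty segments before reading from the end.
  const segments = url.pathname.split('/').filter(Boolean);
  const file = segments.pop();   // undefined if the path is empty
  const folder = segments.pop(); // undefined if there is no parent directory
  return { folder, file };
}

console.log(folderAndFile('www.foobar-images.s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg'));
console.log(folderAndFile('http://192.168.12.44:8090/TOP_PROD_IMAGE/R3CRDT-HZWT_IMRO_1.jpg?size=large'));
```

Because only `pathname` is split, a query string or fragment never ends up in the file name.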

edited Dec 23 '19 at 23:12

answered Dec 23 '19 at 23:08


Also handles query parameters gracefully – ug_ Dec 23 '19 at 23:10
Wow, this is super nice! Is there also any way to isolate the file name? I guess I could work with `URL.pathname`, but it would be nicer to have them separate. – rec0nstr Dec 23 '19 at 23:11
The answer below will get you there. Answer from @C_Ogoo. – dvlden Dec 23 '19 at 23:16

score 0 · Answer 3 · answered Dec 23 '19 at 23:36

If the URLs must start with either http (with an optional s) or www., you could also use a pattern with two capturing groups: the first captures the last directory segment and the second captures the file name after the last slash.

^(?:https?:\/\/|www\.)\S+\/([^/]+)\/(\S+)$

const urls = [
  "https://foo/s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg",
  "http://192.168.12.44:8090/TOP_PROD_IMAGE/R3CRDT-HZWT_IMRO_1.jpg",
  "www.foobar-images.s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg"
];

urls.forEach(s => {
  // Group 1 is the last directory, group 2 is the file name.
  const m = s.match(/^(?:https?:\/\/|www\.)\S+\/([^/]+)\/(\S+)$/);
  if (m) { // match() returns null when the input does not fit the pattern
    console.log(m[1]);
    console.log(m[2]);
    console.log("\n");
  }
});
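If the two parts are needed as separate values rather than logged, the same pattern can be given named groups. This is only a sketch: the names `TAIL` and `parseTail` are made up here, and named capture groups assume an ES2018-capable runtime.

```javascript
// Same pattern as above, with named groups for readability.
const TAIL = /^(?:https?:\/\/|www\.)\S+\/(?<folder>[^/]+)\/(?<file>\S+)$/;

// Hypothetical helper: returns { folder, file }, or null when the input does not match.
function parseTail(input) {
  const m = TAIL.exec(input);
  return m ? { folder: m.groups.folder, file: m.groups.file } : null;
}

console.log(parseTail('http://192.168.12.44:8090/TOP_PROD_IMAGE/R3CRDT-HZWT_IMRO_1.jpg'));
// null: the prefix is neither http(s):// nor www.
console.log(parseTail('ftp://example.invalid/a/b.jpg'));
```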

symlink · Answer 4 · 2019-12-24T00:16:24.163

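You can use negative look-aheads to match only the final URI segments. Note that the bracketed parts are character classes, not literal strings, so the match depends on the input: it starts at the first character that is not one of h, t, p, s, ?, :, /, w, . or a digit, and only where no lowercase c, o, or m appears later in the string: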

/(?!([https?:\/\/]|[www.]))(?!([\d]))(?!(.*[com])).*/

const re = /(?!([https?:\/\/]|[www.]))(?!([\d]))(?!(.*[com])).*/
const arr = [
  "https://foo/s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg",
  "http://192.168.12.44:8090/TOP_PROD_IMAGE/R3CRDT-HZWT_IMRO_1.jpg",
  "www.foobar-images.s3.amazonaws.com/TOP_PROD_IMAGE/WS-25612-BK_IMRO_1.jpg"
]

const res = arr.map(str => re.exec(str)[0].split("/"))

console.log(res)
