I have several download links (i.e., strings), each of a different length.

For example, let's say these fake links are my strings:

My_Link1 <- "http://esgf-data2.diasjp.net/pr/gn/v20190711/pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc"

My_Link2 <- "http://esgf-data2.diasjp.net/gn/v20190711/pr_-present_r1i1p1f1_gn_19500101-19591231.nc"

My goals:

A) I want only the last part of each string, ending in .nc, to get these results:

pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc

pr_-present_r1i1p1f1_gn_19500101-19591231.nc

B) I want only the last part of each string, without the .nc extension, to get these results:

pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231

pr_-present_r1i1p1f1_gn_19500101-19591231

I searched online but could not find a way. It seems this can be done in Python, as documented here:

How to get everything after last slash in a URL?

Does anyone know an equivalent method in R?

Thanks so much for your time.

Ronak Shah
Canada2015
  • `gsub(".*/(.*)\\.nc", "\\1", My_Link2)` – r2evans Oct 04 '19 at 21:58
  • Works perfectly for the second goal. Based on your comment, I found the answer for the first goal, too: `gsub(".*/(.*\\.nc)", "\\1", My_Link2)`. It was a great help. Thanks a lot. – Canada2015 Oct 04 '19 at 22:14

1 Answer

A shortcut to get the last part of the string is to use basename:

basename(My_Link1)
#[1] "pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc"

For the second goal, to remove the trailing ".nc", we can apply sub to the result. Anchoring the pattern with $ ensures only an ".nc" at the very end of the name is removed:

sub("\\.nc$", "", basename(My_Link1))
#[1] "pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231"
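
Both basename and sub are vectorized, so the same calls can handle several links at once. The following is a minimal sketch, assuming the links are collected in a character vector (the names `links`, `file_names`, `stems`, and `stems_alt` are only illustrative). It also shows `tools::file_path_sans_ext()` as a possible alternative for dropping the extension:

```r
# Hypothetical vector holding the two example links from the question
links <- c(
  "http://esgf-data2.diasjp.net/pr/gn/v20190711/pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc",
  "http://esgf-data2.diasjp.net/gn/v20190711/pr_-present_r1i1p1f1_gn_19500101-19591231.nc"
)

# Goal A: file names including the .nc extension
file_names <- basename(links)

# Goal B: file names without the trailing .nc
stems <- sub("\\.nc$", "", file_names)

# Alternative for goal B: strip whatever extension is present
stems_alt <- tools::file_path_sans_ext(file_names)
```

The `sub` version only removes ".nc", while `file_path_sans_ext` removes any final alphanumeric extension, which may or may not be what you want for mixed file types.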

Here is another way to get the first result, using a regex that removes everything up to and including the last slash:

sub(".*/", "", My_Link1)
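
The regex route works because `.*` is greedy, so `.*/` matches through the last slash in the string. As a rough sketch (the variable names `link`, `with_ext`, and `without_ext` are made up for illustration), a capture group like the one suggested in the comments under the question can cover the second goal in a single call:

```r
# One of the example links from the question
link <- "http://esgf-data2.diasjp.net/gn/v20190711/pr_-present_r1i1p1f1_gn_19500101-19591231.nc"

# Greedy ".*/" removes everything up to and including the last slash
with_ext <- sub(".*/", "", link)

# The capture group keeps only the text between the last slash
# and a trailing ".nc"; "\\1" refers to that captured text
without_ext <- sub(".*/(.*)\\.nc$", "\\1", link)
```

If a string does not end in ".nc", the second pattern does not match and `sub` returns the input unchanged, so it may be worth checking the results when the links are not uniform.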
Ronak Shah