Extract first url from text

Question

Let's say we have a sentence like this:

sentence <- "thanks for coming please visit https://www.stackoverflow.com for more or look me up on https://www.linkedin.com"

sentence <- as.data.frame(sentence)

I'd like to extract only the first URL.

The method below works when a sentence contains one URL, but not when it contains several:

library(qdapRegex)

#Extract Url
sentence[["URL"]] <- unlist(rm_url(sentence[["sentence"]], extract=TRUE))

Any ideas would be highly appreciated.

Possible duplicate of [Regular expression to find URLs within a string](https://stackoverflow.com/questions/6038061/regular-expression-to-find-urls-within-a-string) **or** [extracting first value from a list](https://stackoverflow.com/questions/20950221/extracting-first-value-from-a-list) — ctwheels, Sep 07 '17 at 16:14
The `str_extract` function from the stringr package should work: `trimws(str_extract(sentence$sentence, "http.+? "))` Assumes the url ends with a space. — Dave2e, Sep 07 '17 at 16:26

score 0 · Accepted Answer · answered Sep 07 '17 at 18:28

`rm_url(..., extract=TRUE)` returns every URL it finds, so you need to index the unlisted result to keep only the first element:

#Extract Url
sentence[["URL"]] <- unlist(rm_url(sentence[["sentence"]], extract=TRUE))[1]
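Note that `unlist(...)[1]` yields a single value, so on a data frame with several rows it would assign the first URL of the first row to every row. A minimal base R sketch that takes the first match per row instead is shown here. The helper name `first_url`, the data frame `df`, and the pattern are assumptions for illustration: the pattern treats a URL as `http://` or `https://` followed by non-whitespace characters, which is a simplification and will keep trailing punctuation.

```r
# Hypothetical helper: return the first URL in each element of x,
# or NA when an element contains no URL.
# Assumption: a URL starts with http:// or https:// and ends at whitespace.
first_url <- function(x, pattern = "https?://[^[:space:]]+") {
  m <- regexpr(pattern, x)          # position of the first match only
  hit <- !is.na(m) & m != -1        # elements that contain a match
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)      # regmatches() drops non-matches
  out
}

# Illustrative data: one row with two URLs, one row with none
df <- data.frame(
  sentence = c(
    "thanks for coming please visit https://www.stackoverflow.com for more or look me up on https://www.linkedin.com",
    "no links here"
  ),
  stringsAsFactors = FALSE
)

df[["URL"]] <- first_url(df[["sentence"]])
```

Because `regexpr()` reports only the first match in each string, no indexing into a list of all matches is needed, and rows without a URL get `NA` rather than shifting the results.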

Kelli-Jean
