I have some data in this format:
#> # A tibble: 2 × 2
#> record id
#> <int> <chr>
#> 1 1 "<a href=\"https://www.example.com/dir1/dir2/8379\">8379</a>"
#> 2 2 "<a href=\"https://www.example.com/dir1/dir2/8179\">8179</a>"
I would like to use stringr to be left with just the part of the string between ">" and "<".
So my desired output would be:
#> # A tibble: 2 × 2
#> record id
#> <int> <chr>
#> 1 1 "8379"
#> 2 2 "8179"
I have tried using str_match:
str_match(df$id, pattern = ">(....)<")
and the second column is what I'm after:
#> [,1] [,2]
#> [1,] ">8379<" "8379"
#> [2,] ">8179<" "8179"
How do I now use it in, say, a mutate command to change a column in the data frame?
Tidyverse solutions preferred, but open to all answers.
Code for data entry below.
library(tidyverse)
df <- tibble::tribble(
  ~record, ~id,
  1L, "<a href=\"https://www.example.com/dir1/dir2/8379\">8379</a>",
  2L, "<a href=\"https://www.example.com/dir1/dir2/8179\">8179</a>"
)
df
str_match(df$id, pattern = ">(....)<")
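For reference, one way this might be done is to index the second column of the str_match() matrix directly inside mutate(). The following is only a sketch under some assumptions: the names result and result_alt are made up for illustration, and the pattern ">(.+?)<" is a lazy variant of the one above, chosen so it does not depend on the id being exactly four characters long. The str_extract() version with lookarounds is an alternative, not necessarily the better choice.

```r
library(dplyr)
library(stringr)

df <- tibble::tribble(
  ~record, ~id,
  1L, "<a href=\"https://www.example.com/dir1/dir2/8379\">8379</a>",
  2L, "<a href=\"https://www.example.com/dir1/dir2/8179\">8179</a>"
)

# str_match() returns a character matrix: column 1 is the full match,
# column 2 is the first capture group. Keep only column 2.
result <- df %>%
  mutate(id = str_match(id, ">(.+?)<")[, 2])

# Alternative sketch: str_extract() returns a plain character vector,
# so lookarounds can be used to leave ">" and "<" out of the match.
result_alt <- df %>%
  mutate(id = str_extract(id, "(?<=>)[^<]+(?=<)"))

result
```

Both versions overwrite the id column in place; use a different name on the left of the = in mutate() to keep the original column as well.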