I have this basic data frame (printed below, together with my current output).
I want to search the Page column for a SKU (8 digits), capture it with a regex capturing group, and put it in a new column, "SKU_solo".
I don't want the literal string "\1"; I want the first group of 8 digits. How do I use a capturing group in my code?
This is my code (I'm using dplyr):
urls_na <- urls_na %>%
  mutate(SKU_solo = NA, # initialize the new column
         SKU_solo = ifelse(grepl("([0-9]+)", Page), "\\1", SKU_solo))
Current output:
     Page                                 Categoria Page.Views SKU_solo
1  5 /Cajon_Criolla_20141024              #N/A               7 \1
2  6 /Linon_20141115_20141130             #N/A             564 \1
3  7 /Cat/LIQUID                          #N/A               1 NA
4  8 /c_puertas_20141106_20141107         #N/A              34 \1
5  9 /C_Puertas_3_20141017_20141018       #N/A               2 \1
6 10 /c_puertas_navidad_20141204_20141205 #N/A          187319 \1
Desired output:
     Page                                 Categoria Page.Views SKU_solo
1  5 /Cajon_Criolla_20141024              #N/A               7 20141024
2  6 /Linon_20141115_20141130             #N/A             564 20141115
3  7 /Cat/LIQUID                          #N/A               1 NA
4  8 /c_puertas_20141106_20141107         #N/A              34 20141106
5  9 /C_Puertas_3_20141017_20141018       #N/A               2 20141017
6 10 /c_puertas_navidad_20141204_20141205 #N/A          187319 20141204
NOTES:
1) I use ifelse and grepl to do the capturing and replacement. However, it just returns \1 as a string.
2) There may be other numbers, as in row 5. The important one is the first SKU (the first group of 8 digits).
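To make note 2 concrete, here is a minimal base R sketch of picking out the first run of exactly 8 digits from a string with regexpr and regmatches. The vector `pages` is a hypothetical stand-in for the Page column, and this is only one possible way to express the requirement, not necessarily the fix for the code above.

```r
# Hypothetical stand-in for the Page column (values taken from the table above)
pages <- c("/Cajon_Criolla_20141024",
           "/C_Puertas_3_20141017_20141018",
           "/Cat/LIQUID")

# regexpr() locates only the first match in each string; -1 means no match
m <- regexpr("[0-9]{8}", pages)

# Start from NA and fill in only the elements that matched;
# regmatches() returns the matched substrings for those elements
sku <- rep(NA_character_, length(pages))
sku[m != -1] <- regmatches(pages, m)

print(sku)
```

Using {8} instead of + is what skips the lone "3" in row 5, since a single digit cannot satisfy an 8-digit pattern.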
UPDATE:
As you can see, I only get "\1" printed in the SKU_solo column. I know there are other ways of doing this, but what is wrong with my code?
I want to use the capturing group feature of regex. I've read that groups are numbered 1, 2, ... from left to right, one for each pair of parentheses "()". In my code:
ifelse(grepl("([0-9]+)", Page), "\\1", SKU_solo)
the group ([0-9]+) should be assigned number 1, which is why I then use "\1" to refer to it. I don't understand why this doesn't work and only puts the literal "\1" in the SKU_solo column.
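For context, a minimal sketch of the difference in base R: grepl only reports TRUE or FALSE, and ifelse returns its "yes" argument unchanged, so "\\1" is never interpreted as a backreference there. A backreference is expanded in the replacement argument of sub or gsub. The vector `pages` is a hypothetical stand-in for the Page column, and the 8-digit pattern is an assumption based on the SKU description, not the pattern from the original code.

```r
# Hypothetical stand-in for the Page column (values taken from the table above)
pages <- c("/Linon_20141115_20141130",
           "/C_Puertas_3_20141017_20141018",
           "/Cat/LIQUID")

# grepl() only tells us whether the pattern matched; it does not keep the group
has_sku <- grepl("([0-9]{8})", pages)

# ifelse() passes "\\1" through as an ordinary string (backslash followed by 1)
literal <- ifelse(has_sku, "\\1", NA)

# sub() does expand "\\1" in its replacement argument. The pattern covers the
# whole string, so everything except the captured group is dropped.
# perl = TRUE is used for the lazy quantifier .*?
captured <- ifelse(has_sku,
                   sub("^.*?([0-9]{8}).*$", "\\1", pages, perl = TRUE),
                   NA)

print(literal)
print(captured)
```

The same sub call could presumably be placed inside mutate in place of the bare "\\1", with Page as the input.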