I am trying to replace some values for a variable within my data set but I keep getting an unexpected value of 414 assigned instead of 9. I've been over the code a number of times but just cannot get it working.

My code

#replace tumor_size with dummy variable 
Bcdata$Tumor_size=gsub('0-4',1,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('5-9',2,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('10-14',3,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('15-19',4,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('20-24',5,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('25-29',6,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('30-34',7,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('35-39',8,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('40-44',9,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('45-49',10,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('50-54',11,Bcdata$Tumor_size)
Bcdata$Tumor_size=gsub('55-59',12,Bcdata$Tumor_size)

Table before and after I run my code

> table(Bcdata$Tumor_size)

  0-4 10-14 15-19 20-24 25-29 30-34 35-39 40-44 45-49   5-9 50-54 
    8    28    30    50    54    60    19    22     3     4     8

> table(Bcdata$Tumor_size)

  1  10  11   2   3   4 414   5   6   7   8 
  8   3   8   4  28  30  22  50  54  60  19 
> 

And a sample of the data.

> head(Bcdata)
                 Class   Age Menopause Tumor_size Inv_nodes Node_caps Deg_malig Breast Irradiate
1 no-recurrence-events 30-39   premeno      30-34       0-2        no         3   left        no
2 no-recurrence-events 40-49   premeno      20-24       0-2        no         2  right        no
3 no-recurrence-events 40-49   premeno      20-24       0-2        no         2   left        no
4 no-recurrence-events 60-69      ge40      15-19       0-2        no         2  right        no
5 no-recurrence-events 40-49   premeno        0-4       0-2        no         2  right        no
6 no-recurrence-events 60-69      ge40      15-19       0-2        no         2   left        no
> tail(Bcdata)
                Class   Age Menopause Tumor_size Inv_nodes Node_caps Deg_malig Breast Irradiate
281 recurrence-events 50-59      ge40      40-44       6-8       yes         3   left       yes
282 recurrence-events 30-39   premeno      30-34       0-2        no         2   left        no
283 recurrence-events 30-39   premeno      20-24       0-2        no         3   left       yes
284 recurrence-events 60-69      ge40      20-24       0-2        no         1  right        no
285 recurrence-events 40-49      ge40      30-34       3-5        no         3   left        no
286 recurrence-events 50-59      ge40      30-34       3-5        no         3   left        no

I keep rewriting the code to fix it, even though it looks right, then resetting the data to the raw values and running the code again, but the same thing keeps happening. Help!!

EDIT: as requested, partial and full dput

> dput(Bcdata$Tumor_size)
structure(c(6L, 4L, 4L, 3L, 1L, 3L, 5L, 4L, 11L, 4L, 1L, 5L, 
2L, 5L, 6L, 6L, 3L, 6L, 6L, 6L, 8L, 3L, 5L, 8L, 7L, 5L, 4L, 5L, 
8L, 6L, 8L, 3L, 2L, 2L, 2L, 6L, 1L, 3L, 2L, 6L, 4L, 5L, 10L, 
2L, 11L, 6L, 5L, 5L, 4L, 4L, 3L, 4L, 3L, 4L, 8L, 8L, 1L, 10L, 
6L, 3L, 4L, 2L, 1L, 7L, 5L, 2L, 5L, 4L, 7L, 11L, 2L, 5L, 4L, 
3L, 10L, 2L, 2L, 5L, 5L, 5L, 2L, 2L, 3L, 3L, 4L, 7L, 5L, 1L, 
4L, 8L, 1L, 4L, 5L, 4L, 2L, 6L, 6L, 3L, 6L, 5L, 4L, 6L, 5L, 4L, 
2L, 6L, 4L, 8L, 6L, 6L, 5L, 3L, 4L, 2L, 7L, 4L, 3L, 4L, 2L, 3L, 
4L, 3L, 8L, 6L, 2L, 2L, 6L, 5L, 5L, 7L, 7L, 8L, 6L, 8L, 6L, 4L, 
8L, 10L, 8L, 6L, 8L, 4L, 2L, 9L, 9L, 5L, 11L, 6L, 4L, 6L, 5L, 
6L, 7L, 3L, 3L, 8L, 5L, 6L, 6L, 7L, 5L, 6L, 2L, 5L, 5L, 4L, 4L, 
8L, 2L, 6L, 4L, 3L, 6L, 4L, 5L, 6L, 5L, 2L, 5L, 4L, 7L, 7L, 5L, 
6L, 6L, 4L, 5L, 3L, 2L, 4L, 3L, 5L, 6L, 2L, 11L, 7L, 2L, 2L, 
3L, 5L, 5L, 3L, 8L, 7L, 5L, 1L, 6L, 5L, 6L, 7L, 4L, 4L, 6L, 5L, 
8L, 4L, 4L, 3L, 6L, 3L, 5L, 6L, 5L, 4L, 5L, 4L, 6L, 6L, 8L, 9L, 
11L, 6L, 6L, 3L, 6L, 5L, 5L, 5L, 7L, 4L, 4L, 3L, 5L, 4L, 6L, 
6L, 3L, 6L, 7L, 4L, 5L, 11L, 8L, 11L, 6L, 6L, 6L, 4L, 6L, 6L, 
5L, 5L, 5L, 4L, 4L, 7L, 6L, 4L, 7L, 5L, 6L, 5L, 3L, 6L, 6L, 5L, 
5L, 2L, 7L, 8L, 8L, 6L, 4L, 4L, 6L, 6L), .Label = c("0-4", "10-14", 
"15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", 
"5-9", "50-54"), class = "factor")
> dput(Bcdata)
structure(list(Class = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("no-recurrence-events", 
"recurrence-events"), class = "factor"), Age = structure(c(2L, 
3L, 3L, 5L, 3L, 5L, 4L, 5L, 3L, 3L, 3L, 4L, 5L, 4L, 3L, 5L, 3L, 
4L, 5L, 4L, 4L, 5L, 2L, 4L, 4L, 3L, 4L, 5L, 3L, 5L, 4L, 4L, 4L, 
4L, 4L, 2L, 4L, 4L, 3L, 3L, 4L, 5L, 5L, 3L, 4L, 4L, 3L, 4L, 3L, 
3L, 4L, 2L, 4L, 6L, 6L, 6L, 4L, 4L, 5L, 5L, 3L, 3L, 4L, 1L, 3L, 
3L, 3L, 4L, 4L, 5L, 5L, 3L, 5L, 4L, 2L, 4L, 4L, 2L, 4L, 3L, 4L, 
5L, 5L, 4L, 3L, 4L, 5L, 6L, 4L, 3L, 2L, 4L, 4L, 5L, 4L, 3L, 5L, 
5L, 3L, 2L, 3L, 4L, 4L, 3L, 3L, 3L, 3L, 2L, 3L, 5L, 4L, 4L, 3L, 
3L, 3L, 4L, 2L, 3L, 2L, 5L, 5L, 4L, 4L, 4L, 5L, 6L, 2L, 2L, 4L, 
3L, 3L, 3L, 3L, 4L, 5L, 2L, 2L, 3L, 2L, 3L, 4L, 4L, 5L, 3L, 5L, 
3L, 5L, 4L, 2L, 4L, 4L, 5L, 4L, 5L, 2L, 5L, 4L, 4L, 4L, 3L, 3L, 
3L, 5L, 5L, 5L, 3L, 3L, 3L, 4L, 3L, 2L, 2L, 5L, 4L, 4L, 3L, 3L, 
5L, 4L, 3L, 3L, 3L, 3L, 4L, 4L, 3L, 4L, 5L, 3L, 4L, 3L, 3L, 4L, 
2L, 4L, 4L, 4L, 3L, 4L, 4L, 5L, 4L, 3L, 4L, 4L, 2L, 4L, 4L, 4L, 
3L, 3L, 4L, 3L, 4L, 5L, 3L, 4L, 3L, 5L, 2L, 3L, 2L, 5L, 5L, 2L, 
3L, 3L, 4L, 5L, 5L, 4L, 3L, 2L, 6L, 5L, 4L, 3L, 3L, 2L, 3L, 5L, 
3L, 4L, 4L, 3L, 2L, 2L, 4L, 5L, 2L, 3L, 3L, 2L, 5L, 3L, 3L, 3L, 
3L, 4L, 4L, 5L, 3L, 5L, 4L, 4L, 2L, 3L, 5L, 2L, 3L, 4L, 4L, 3L, 
5L, 5L, 3L, 2L, 5L, 4L, 4L, 4L, 2L, 2L, 5L, 3L, 4L), .Label = c("20-29", 
"30-39", "40-49", "50-59", "60-69", "70-79"), class = "factor"), 
    Menopause = structure(c(3L, 3L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 
    3L, 3L, 1L, 2L, 1L, 3L, 2L, 3L, 3L, 1L, 1L, 1L, 1L, 3L, 3L, 
    3L, 3L, 3L, 1L, 3L, 1L, 1L, 3L, 3L, 1L, 1L, 3L, 1L, 1L, 3L, 
    3L, 1L, 1L, 1L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 3L, 3L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 1L, 1L, 
    1L, 1L, 3L, 1L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 
    3L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 1L, 1L, 3L, 3L, 1L, 1L, 3L, 
    3L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 3L, 3L, 
    3L, 1L, 3L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 1L, 3L, 
    1L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 3L, 1L, 
    3L, 3L, 3L, 1L, 1L, 1L, 3L, 3L, 1L, 3L, 1L, 3L, 3L, 1L, 1L, 
    3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 3L, 1L, 1L, 3L, 
    1L, 3L, 3L, 1L, 3L, 3L, 1L, 3L, 3L, 1L, 3L, 1L, 3L, 3L, 1L, 
    3L, 3L, 1L, 3L, 3L, 3L, 3L, 1L, 3L, 3L, 1L, 1L, 1L, 3L, 1L, 
    3L, 3L, 3L, 1L, 1L, 3L, 1L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 1L, 
    1L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 
    3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 2L, 
    2L, 3L, 3L, 1L, 3L, 3L, 1L, 3L, 3L, 1L, 1L, 3L, 3L, 1L, 3L, 
    1L, 1L, 3L, 3L, 1L, 1L, 1L), .Label = c("ge40", "lt40", "premeno"
    ), class = "factor"), Tumor_size = structure(c(6L, 4L, 4L, 
    3L, 1L, 3L, 5L, 4L, 11L, 4L, 1L, 5L, 2L, 5L, 6L, 6L, 3L, 
    6L, 6L, 6L, 8L, 3L, 5L, 8L, 7L, 5L, 4L, 5L, 8L, 6L, 8L, 3L, 
    2L, 2L, 2L, 6L, 1L, 3L, 2L, 6L, 4L, 5L, 10L, 2L, 11L, 6L, 
    5L, 5L, 4L, 4L, 3L, 4L, 3L, 4L, 8L, 8L, 1L, 10L, 6L, 3L, 
    4L, 2L, 1L, 7L, 5L, 2L, 5L, 4L, 7L, 11L, 2L, 5L, 4L, 3L, 
    10L, 2L, 2L, 5L, 5L, 5L, 2L, 2L, 3L, 3L, 4L, 7L, 5L, 1L, 
    4L, 8L, 1L, 4L, 5L, 4L, 2L, 6L, 6L, 3L, 6L, 5L, 4L, 6L, 5L, 
    4L, 2L, 6L, 4L, 8L, 6L, 6L, 5L, 3L, 4L, 2L, 7L, 4L, 3L, 4L, 
    2L, 3L, 4L, 3L, 8L, 6L, 2L, 2L, 6L, 5L, 5L, 7L, 7L, 8L, 6L, 
    8L, 6L, 4L, 8L, 10L, 8L, 6L, 8L, 4L, 2L, 9L, 9L, 5L, 11L, 
    6L, 4L, 6L, 5L, 6L, 7L, 3L, 3L, 8L, 5L, 6L, 6L, 7L, 5L, 6L, 
    2L, 5L, 5L, 4L, 4L, 8L, 2L, 6L, 4L, 3L, 6L, 4L, 5L, 6L, 5L, 
    2L, 5L, 4L, 7L, 7L, 5L, 6L, 6L, 4L, 5L, 3L, 2L, 4L, 3L, 5L, 
    6L, 2L, 11L, 7L, 2L, 2L, 3L, 5L, 5L, 3L, 8L, 7L, 5L, 1L, 
    6L, 5L, 6L, 7L, 4L, 4L, 6L, 5L, 8L, 4L, 4L, 3L, 6L, 3L, 5L, 
    6L, 5L, 4L, 5L, 4L, 6L, 6L, 8L, 9L, 11L, 6L, 6L, 3L, 6L, 
    5L, 5L, 5L, 7L, 4L, 4L, 3L, 5L, 4L, 6L, 6L, 3L, 6L, 7L, 4L, 
    5L, 11L, 8L, 11L, 6L, 6L, 6L, 4L, 6L, 6L, 5L, 5L, 5L, 4L, 
    4L, 7L, 6L, 4L, 7L, 5L, 6L, 5L, 3L, 6L, 6L, 5L, 5L, 2L, 7L, 
    8L, 8L, 6L, 4L, 4L, 6L, 6L), .Label = c("0-4", "10-14", "15-19", 
    "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "5-9", 
    "50-54"), class = "factor"), Inv_nodes = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 6L, 6L, 1L, 7L, 7L, 5L, 6L, 1L, 1L, 5L, 
    5L, 1L, 1L, 1L, 5L, 5L, 1L, 1L, 6L, 1L, 1L, 5L, 1L, 1L, 3L, 
    5L, 3L, 1L, 1L, 5L, 5L, 1L, 1L, 1L, 1L, 5L, 1L, 5L, 5L, 5L, 
    5L, 3L, 1L, 1L, 5L, 1L, 6L, 5L, 5L, 1L, 1L, 1L, 5L, 1L, 1L, 
    1L, 1L, 7L, 7L, 6L, 1L, 1L, 1L, 1L, 2L, 1L, 6L, 1L, 1L, 1L, 
    5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 5L, 5L, 
    3L, 1L, 5L, 1L, 7L, 5L, 5L, 7L, 1L, 5L, 1L, 1L, 1L, 5L, 5L, 
    3L, 6L, 5L, 2L, 7L, 6L, 7L, 6L, 5L, 1L, 1L, 1L, 1L, 1L, 6L, 
    1L, 5L, 6L, 5L, 5L, 2L, 1L, 1L, 1L, 7L, 5L, 4L, 1L, 1L, 6L, 
    1L, 1L, 1L, 5L, 7L, 6L, 6L, 3L, 6L, 6L, 1L, 1L, 1L, 5L, 5L
    ), .Label = c("0-2", "12-14", "15-17", "24-26", "3-5", "6-8", 
    "9-11"), class = "factor"), Node_caps = structure(c(2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 1L, 2L, 3L, 2L, 2L, 3L, 3L, 
    2L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 2L, 1L, 1L, 2L, 2L, 
    3L, 2L, 2L, 3L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 
    2L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 2L, 3L, 2L, 2L, 
    2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 3L, 
    2L, 3L, 2L, 3L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 2L, 3L, 2L, 3L, 
    2L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 2L, 
    3L, 3L, 2L, 2L, 3L, 2L, 1L, 1L, 3L, 3L, 3L, 2L, 2L, 3L, 2L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L), .Label = c("?", 
    "no", "yes"), class = "factor"), Deg_malig = c(3L, 2L, 2L, 
    2L, 2L, 2L, 2L, 1L, 2L, 2L, 3L, 2L, 1L, 3L, 3L, 1L, 2L, 3L, 
    3L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 3L, 2L, 2L, 3L, 2L, 3L, 
    1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 
    1L, 1L, 2L, 2L, 1L, 3L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 
    2L, 1L, 1L, 1L, 3L, 3L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 
    2L, 2L, 2L, 1L, 2L, 2L, 1L, 3L, 2L, 1L, 3L, 1L, 2L, 3L, 2L, 
    2L, 1L, 2L, 2L, 2L, 1L, 2L, 3L, 3L, 2L, 2L, 2L, 1L, 2L, 2L, 
    3L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 3L, 1L, 1L, 1L, 2L, 3L, 
    1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 3L, 2L, 2L, 3L, 1L, 
    2L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
    3L, 3L, 2L, 3L, 1L, 1L, 1L, 3L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 
    2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 
    3L, 3L, 2L, 1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 
    2L, 1L, 3L, 2L, 1L, 2L, 2L, 2L, 3L, 2L, 3L, 1L, 2L, 2L, 3L, 
    1L, 2L, 2L, 2L, 2L, 3L, 1L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 1L, 2L, 2L, 3L, 1L, 3L, 3L, 2L, 2L, 3L, 2L, 2L, 3L, 3L, 
    3L, 3L, 2L, 3L, 3L, 3L, 2L, 3L, 2L, 1L, 3L, 3L, 3L, 1L, 2L, 
    2L, 3L, 2L, 3L, 3L, 1L, 1L, 3L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 
    2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 3L, 1L, 3L, 3L), Breast = structure(c(1L, 
    2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 
    2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 
    2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 
    2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 
    1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 
    1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 
    1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 
    2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 
    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 
    1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 
    2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 
    2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 
    2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 
    1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 
    1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 
    1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L
    ), .Label = c("left", "right"), class = "factor"), Breast_quad = structure(c(3L, 
    6L, 3L, 4L, 5L, 3L, 3L, 3L, 3L, 4L, 2L, 3L, 6L, 6L, 4L, 3L, 
    3L, 3L, 3L, 6L, 3L, 3L, 3L, 4L, 4L, 4L, 3L, 4L, 3L, 3L, 4L, 
    3L, 3L, 4L, 4L, 4L, 2L, 2L, 3L, 3L, 3L, 3L, 2L, 4L, 6L, 4L, 
    3L, 4L, 6L, 3L, 3L, 5L, 3L, 4L, 4L, 6L, 2L, 6L, 4L, 4L, 2L, 
    5L, 3L, 6L, 5L, 4L, 5L, 4L, 3L, 3L, 3L, 4L, 4L, 5L, 5L, 3L, 
    3L, 2L, 3L, 2L, 3L, 4L, 3L, 3L, 5L, 4L, 3L, 5L, 4L, 4L, 2L, 
    4L, 4L, 4L, 3L, 5L, 4L, 4L, 6L, 3L, 3L, 3L, 5L, 5L, 3L, 4L, 
    4L, 6L, 6L, 4L, 3L, 2L, 4L, 4L, 6L, 4L, 3L, 4L, 3L, 5L, 3L, 
    6L, 4L, 3L, 3L, 2L, 6L, 4L, 4L, 4L, 6L, 4L, 4L, 6L, 3L, 2L, 
    6L, 3L, 3L, 5L, 3L, 3L, 4L, 3L, 2L, 5L, 4L, 3L, 2L, 4L, 4L, 
    3L, 3L, 4L, 4L, 4L, 4L, 2L, 2L, 3L, 4L, 3L, 4L, 4L, 3L, 4L, 
    3L, 4L, 4L, 4L, 4L, 3L, 6L, 4L, 3L, 6L, 3L, 3L, 4L, 3L, 4L, 
    3L, 3L, 4L, 3L, 3L, 5L, 4L, 4L, 4L, 5L, 4L, 3L, 5L, 4L, 4L, 
    4L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 6L, 2L, 1L, 6L, 6L, 4L, 3L, 
    2L, 6L, 4L, 3L, 4L, 4L, 4L, 2L, 3L, 6L, 4L, 5L, 3L, 3L, 3L, 
    3L, 4L, 3L, 6L, 4L, 4L, 4L, 3L, 4L, 3L, 3L, 3L, 3L, 6L, 3L, 
    3L, 3L, 6L, 4L, 4L, 3L, 5L, 3L, 3L, 4L, 3L, 4L, 4L, 6L, 4L, 
    3L, 3L, 5L, 4L, 6L, 5L, 4L, 4L, 3L, 3L, 6L, 3L, 3L, 3L, 5L, 
    3L, 4L, 6L, 2L, 4L, 5L, 4L, 6L, 3L, 3L, 4L, 4L, 4L, 3L, 3L
    ), .Label = c("?", "central", "left_low", "left_up", "right_low", 
    "right_up"), class = "factor"), Irradiate = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 
    2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 
    1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 
    1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 
    1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 
    1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 
    1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 
    2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 
    2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L
    ), .Label = c("no", "yes"), class = "factor")), class = "data.frame", row.names = c(NA, 
-286L))
  • Can you copy the output of `dput(Bcdata)`, or at least of `dput(Bcdata$Tumor_size)`? – iago May 14 '21 at 11:32
  • Maybe you can trick the thing by doing this: Bcdata$Tumor_size=gsub('40-44','9',Bcdata$Tumor_size). Maybe the table will react differently to character and if you need to change afterward it's easy – elielink May 14 '21 at 11:33
  • Tried that and same thing happened. Still getting 414 instead of 9 – Student Work May 14 '21 at 11:42
  • @StudentWork as an answer has pointed out, "0-4" is being replaced by "1" in your first call to `gsub`. A quick fix would be to run that `gsub` last – David May 14 '21 at 12:53

3 Answers

'40-44' is being changed to '414' by the first gsub function, because it matches the middle part of the string:

Bcdata$Tumor_size=gsub('0-4',1,Bcdata$Tumor_size)
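
To see the substring match in isolation, here is a minimal sketch on a hypothetical three-element vector `sizes` (an illustration only, not the full data set):

```r
# Hypothetical sample of the original labels
sizes <- c("0-4", "40-44", "10-14")

# The pattern "0-4" is not anchored, so it also matches inside "40-44"
after_first_call <- gsub("0-4", "1", sizes)
print(after_first_call)
# [1] "1"     "414"   "10-14"
```

Once "40-44" has become "414", the later call that looks for '40-44' has nothing left to match.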

You should use a proper recoding function, or encode into a factor then use as.numeric to turn it into integer dummy values.
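
One possible way to recode in base R is a named lookup vector, which matches whole labels rather than substrings. This is a sketch under assumptions: `tumor_size` is a hypothetical sample, and `size_codes` simply restates the mapping from the question.

```r
# Hypothetical sample of the original labels
tumor_size <- factor(c("30-34", "0-4", "40-44", "5-9"))

# Lookup table from label to code, following the mapping in the question
size_codes <- c("0-4" = 1L, "5-9" = 2L, "10-14" = 3L, "15-19" = 4L,
                "20-24" = 5L, "25-29" = 6L, "30-34" = 7L, "35-39" = 8L,
                "40-44" = 9L, "45-49" = 10L, "50-54" = 11L, "55-59" = 12L)

# Index by the label text (not the factor's integer codes), then drop the names
recoded <- unname(size_codes[as.character(tumor_size)])
print(recoded)
# [1] 7 1 9 2
```

Because the lookup is by exact name, "40-44" can only ever map to 9.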

George Savva
  • $Tumor_size is already listed as a factor. What sort of recoding function would you suggest? – Student Work May 14 '21 at 12:30
  • If it's already a factor, you could try `forcats::fct_recode()` https://forcats.tidyverse.org/reference/fct_recode.html – s_pike May 14 '21 at 12:43
  • Reversing the order of the `gsub` calls would do the job, but it'd definitely be a bad practice – David May 14 '21 at 12:52
  • If it's already a factor and the levels are in the right order then `as.numeric` will turn it into an integer. – George Savva May 14 '21 at 13:09

If you want a really quick solution, you could just change the pattern to match exactly:

Bcdata$Tumor_size=gsub('^0-4$',1,Bcdata$Tumor_size)
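
As a small illustration of the anchored pattern, assuming a hypothetical sample vector `sizes`: `^` and `$` force the pattern to cover the whole string, so "40-44" is left alone until its own replacement runs.

```r
# Hypothetical sample of the original labels
sizes <- c("0-4", "40-44", "10-14")

# Anchored pattern: only the element that is exactly "0-4" is replaced
anchored <- gsub("^0-4$", "1", sizes)
print(anchored)
# [1] "1"     "40-44" "10-14"

# The later replacement for "40-44" now finds its target intact
anchored <- gsub("^40-44$", "9", anchored)
print(anchored)
# [1] "1"     "9"     "10-14"
```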

reference: Match exact string

s_pike

Unless I'm missing something, you're working way harder than you have to.

In your data, Tumor_size is already a factor, so as.numeric() will convert each value to its integer level code. Be aware that the levels are sorted alphabetically ("5-9" is level 10, after "45-49"), so reorder the levels first if the codes must follow tumor size.

table(as.numeric(Bcdata$Tumor_size))

 1  2  3  4  5  6  7  8  9 10 11 
 8 28 30 50 54 60 19 22  3  4  8 
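
One way to get codes that follow tumor size is to give the factor an explicit level order before converting. A minimal sketch, assuming a hypothetical sample vector `tumor_size` and the size bands listed in the question (`size_order` is an illustrative name):

```r
# Hypothetical sample; default factor levels are sorted as text
tumor_size <- factor(c("0-4", "5-9", "10-14", "40-44"))

# Size bands in their intended order, taken from the question's mapping
size_order <- c("0-4", "5-9", "10-14", "15-19", "20-24", "25-29",
                "30-34", "35-39", "40-44", "45-49", "50-54", "55-59")

# Rebuild the factor with explicit levels, then take the integer codes
ordered_size <- factor(as.character(tumor_size), levels = size_order)
codes <- as.integer(ordered_size)
print(codes)
# [1] 1 2 3 9
```

Bands that do not occur in the data (such as "55-59") still keep their position, so the codes stay stable if new rows are added.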
Ben Bolker