I have some data; its dput output is given below.

Data:

dput(data)
structure(c(12L, 2L, 14L, 2L, 2L, 12L, 14L, 13L, 14L, 12L), .Label = c("0 Ã  10 cm", 
"10 Ã  20 cm", "100 Ã  110 cm", "110 Ã  120 cm", "120 Ã  130 cm", 
"130 Ã  140 cm", "140 Ã  150 cm", "150 Ã  160 cm", "160 Ã  170 cm", 
"170 Ã  180 cm", "180 Ã  190 cm", "20 Ã  30 cm", "30 Ã  40 cm", 
"40 Ã  50 cm", "50 Ã  60 cm", "60 Ã  70 cm", "70 Ã  80 cm", "80 Ã  90 cm", 
"90 Ã  100 cm", "N/A"), class = "factor")

The labels are in French: they are categories, from 0 to 10, from 10 to 20, and so on. I need to replace the second column with numbers; for example, 0 Ã 10 cm should become 1, 10 Ã 20 cm should become 2, and so on. How can I automatically replace each value with a number based on its category in R?

In the new table, column 2 should hold the category level in place of the Ã, with the values given below:

2 1 4 1 1 2 4 3 4 2
Sathish
L.Bond
  • It is difficult to understand your data structure. Please provide an example dataset. – www Mar 18 '17 at 21:18
  • I need to assign the number based on text for example from 0 to 10 it should be 1, from 20 to 30 - should be 2 and so on – L.Bond Mar 18 '17 at 21:31
  • `as.numeric(as.factor())` ought to do. – Gregor Thomas Mar 18 '17 at 22:18
  • Please have a look at [how to make a reproducible example](http://stackoverflow.com/q/5963269/903061). Your goal is clear. But there is a problem that we don't know how your data look - is it a vector? a data frame? Some other structure? What class is it? Is it in one column or multiple columns? We can see that Sathish assumes you have 4 columns, but your comment to the answer suggests you have just 1 column. These questions can be answered if you follow the [reproducible example advice here](http://stackoverflow.com/q/5963269/903061). – Gregor Thomas Mar 18 '17 at 22:20

1 Answer

First, tidy your data into a clean structure; I did this in the "Tidy your data" section below. Then convert column 1 to a factor and then to numeric to get the category levels, and assign the result to column 2.

df[, 2] <- as.numeric( factor( df[[1]]) )

#    X1 X2    X3
# 1  20  2 30 cm
# 2  10  1 20 cm
# 3  40  4 50 cm
# 4  10  1 20 cm
# 5  10  1 20 cm
# 6  20  2 30 cm
# 7  40  4 50 cm
# 8  30  3 40 cm
# 9  40  4 50 cm
# 10 20  2 30 cm
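
Note that the factor codes above number only the values that actually occur in column 1, so a range missing from the data would shift the later codes. As a rough sketch of an alternative, the category can be computed from the lower bound of each label instead. The helper `range_to_category` and its arguments `width` and `first` are hypothetical names introduced here for illustration; the 10 cm bin width is an assumption read off the labels.

```r
# Hypothetical helper (not part of the original answer):
# map a range label such as "20 ... 30 cm" to a category number.
# `width` (bin size) and `first` (number given to the "0 ... 10" bin) are assumptions.
range_to_category <- function(x, width = 10, first = 1) {
  # Drop everything from the first non-digit onward, leaving the lower bound;
  # labels with no leading digits (such as "N/A") become NA.
  lower <- as.numeric(sub("[^0-9].*$", "", as.character(x)))
  lower / width + first
}

x <- factor(c("20 Ã  30 cm", "10 Ã  20 cm", "40 Ã  50 cm", "0 Ã  10 cm", "N/A"))

cat_from_one  <- range_to_category(x)             # "0 ... 10 cm" is category 1
cat_from_zero <- range_to_category(x, first = 0)  # "20 ... 30 cm" is category 2
```

With `first = 1` the numbering follows the wording of the question (0 to 10 is 1, 10 to 20 is 2); with `first = 0` it follows the values shown in the output above (20 to 30 is 2).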

Data:

df <- structure(c(12L, 2L, 14L, 2L, 2L, 12L, 14L, 13L, 14L, 12L),
                .Label = c("0 Ã  10 cm", "10 Ã  20 cm", "100 Ã  110 cm", "110 Ã  120 cm", "120 Ã  130 cm", 
                           "130 Ã  140 cm", "140 Ã  150 cm", "150 Ã  160 cm", "160 Ã  170 cm", 
                           "170 Ã  180 cm", "180 Ã  190 cm", "20 Ã  30 cm", "30 Ã  40 cm", 
                           "40 Ã  50 cm", "50 Ã  60 cm", "60 Ã  70 cm", "70 Ã  80 cm", "80 Ã  90 cm", 
                           "90 Ã  100 cm", "N/A"), class = "factor")
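
Before tidying, it may help to see why converting the original factor directly does not give the wanted numbers. This is a small sketch on a cut-down set of levels (the vectors `lv` and `f` are made up for illustration): the integer code of a factor is the position of the label among the levels, and the levels in the data are sorted as text, so "100 ..." comes before "20 ...".

```r
# A cut-down version of the levels from the question, in the same text-sorted order.
lv <- c("0 Ã  10 cm", "10 Ã  20 cm", "100 Ã  110 cm", "20 Ã  30 cm")
f  <- factor(c("20 Ã  30 cm", "10 Ã  20 cm"), levels = lv)

# The codes are positions in levels(f), not category numbers.
codes <- as.integer(f)
codes
```

In the full data "20 Ã  30 cm" is the 12th label, which is why the first value in the dput output is 12L.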

Tidy your data:

df <- as.character( df )  # convert factor to character
df <- data.frame( do.call('rbind', strsplit( df, " ") ), stringsAsFactors = FALSE )  # split string by spaces and row bind them together ("\ " is not a valid escape in R)
df$X3 <- paste( df$X4, df$X5, sep = ' ')   # combine column 4 & 5 together and assign it to column 3
df[, c('X4', 'X5')] <- NULL  # remove column 4 and 5
df$X1 <- as.numeric( df$X1)  # convert column 1 to numeric
df                           # structure of data 
#    X1 X2    X3
# 1  20  Ã 30 cm
# 2  10  Ã 20 cm
# 3  40  Ã 50 cm
# 4  10  Ã 20 cm
# 5  10  Ã 20 cm
# 6  20  Ã 30 cm
# 7  40  Ã 50 cm
# 8  30  Ã 40 cm
# 9  40  Ã 50 cm
# 10 20  Ã 30 cm
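
The split above depends on the exact whitespace inside the labels. A minimal sketch of an alternative that pulls the numbers out with a regular expression is given below, assuming every label contains exactly two numbers (so "N/A" has to be filtered out first). The column names `lower`, `upper` and `unit` are made up for illustration.

```r
labels <- c("20 Ã  30 cm", "10 Ã  20 cm", "100 Ã  110 cm", "N/A")

# Keep only labels that contain digits; "N/A" has none (assumption: it can be dropped).
labels <- labels[grepl("[0-9]", labels)]

# Extract every run of digits from each label: a list of character vectors.
nums <- regmatches(labels, gregexpr("[0-9]+", labels))

# Stack the pairs into a two-column numeric matrix, one row per label.
bounds <- do.call(rbind, lapply(nums, as.numeric))

ranges <- data.frame(lower = bounds[, 1], upper = bounds[, 2], unit = "cm")
ranges
```

This also puts each part of the label in its own column, which is what the comments below ask about.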
Sathish
  • this might work, and one more question - how can I spread the data into (0 Ã 10 cm) into V1 V2 V3 and so on? – L.Bond Mar 18 '17 at 21:55
  • Thank you, Sathish! I will try it now! – L.Bond Mar 18 '17 at 22:00
  • @L.Bond `cut` command is the right one for this task, so I removed the previous two approaches. – Sathish Mar 18 '17 at 22:17
  • Thank you! But I still have a problem - all the values "0 Ã 10 cm" is in the one column, is there a way to break in into different colums? – L.Bond Mar 18 '17 at 22:22
  • please use this command `dput(data)` and post the output of it in your question. It will help people to understand the structure of your data. Please see Gregor's comment on reproducible example. – Sathish Mar 18 '17 at 22:24
  • > dput(c) structure(c(12L, 2L, 14L, 2L, 2L, 12L, 14L, 13L, 14L, 12L), .Label = c("0 Ã 10 cm", "10 Ã 20 cm", "100 Ã 110 cm", "110 Ã 120 cm", "120 Ã 130 cm", "130 Ã 140 cm", "140 Ã 150 cm", "150 Ã 160 cm", "160 Ã 170 cm", "170 Ã 180 cm", "180 Ã 190 cm", "20 Ã 30 cm", "30 Ã 40 cm", "40 Ã 50 cm", "50 Ã 60 cm", "60 Ã 70 cm", "70 Ã 80 cm", "80 Ã 90 cm", "90 Ã 100 cm", "N/A"), class = "factor") – L.Bond Mar 18 '17 at 23:11
  • I beg your pardon, the question was not clear. – L.Bond Mar 18 '17 at 23:12