Is there an R function for converting a Categorical Variable (in Character) to a Quantitative Variable?

Question

I have a categorical variable (stored as character) whose values are codes of the form xx-xxxx. The first 2 digits of each code determine the category of the response, and I would like to bin the responses according to these first 2 digits. For example, there are 28 responses coded as 11-xxxx, and I would like to combine all 28 of them into one category. To do this, I thought I could convert the coded categorical variable to a quantitative variable so that I can bin the responses more efficiently by their first 2 digits. Is there an R function for making this conversion?

Image of the Frequency Distribution of the first few responses for the variable

I am a beginner coder and this is my first time posting to stack overflow. Thank you for your help!

dput(data$H4LM18) Sample

If you're using the tidyverse, [this answer](https://stackoverflow.com/a/44424567/10898875) illustrates a neat way to make a new column from the first two digits of the dummy code. (The other answers to that question have other options you could explore.) — A. S. K., Dec 04 '19 at 20:31
Thanks for your input! I am looking to bin the responses so I can graph them so I'm afraid organizing them into columns won't be sufficient. — user12481858, Dec 04 '19 at 21:31
Do you want the final data frame to have one row per first two digits of the code (`11`, etc.)? Or are you looking for a column that encodes which "bin" the row goes in so that you can process the data frame more efficiently downstream? — A. S. K., Dec 04 '19 at 21:47
I would like the final data frame to have one row per first two digits of the code. — user12481858, Dec 05 '19 at 21:03
That's helpful. Can you post a sample of your data, using `dput`? — A. S. K., Dec 05 '19 at 23:29
I just added a picture of a sample of my data to the original post. — user12481858, Dec 06 '19 at 03:53
I doubt that blurring the distinction between categorical and numerical variables is a reliable way to group categorical variables. This sounds like an [XY problem](https://meta.stackexchange.com/q/66377/357835). — John Coleman, Dec 06 '19 at 03:56

score 0 · Answer 1 · answered Dec 06 '19 at 04:08

I was able to receive help from a Help Desk and we successfully binned the variable according to the first two digits of the dummy code.

Here is the code we used, for the data frame `data` and the variable `H4LM18`:

data$jobcategory <- data$H4LM18  # work on a copy of the original variable

data$jobbracket <- unlist(lapply(strsplit(data$jobcategory, "-"), function(x) x[1]))  # keep only the part before the dash

By splitting the dummy code of the responses at the dash ("-") we were able to categorize the responses according to the first two digits of the dummy code alone.
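For anyone who wants to go from the extracted prefix to one row per two-digit bracket (as asked for in the comments), here is a minimal sketch in base R. The data frame `toy` and its codes are made up for illustration and are not the real `H4LM18` values; `sub()` is used here as an alternative to the `strsplit()` approach above, and it assumes every code contains a dash after the prefix.

```r
# Hypothetical sample codes in the xx-xxxx format described in the question
toy <- data.frame(
  H4LM18 = c("11-1011", "11-2021", "13-1041", "13-2011", "11-9199", "15-1132"),
  stringsAsFactors = FALSE
)

# Drop the first dash and everything after it, leaving the two-digit prefix
toy$jobbracket <- sub("-.*$", "", toy$H4LM18)

# Count responses per prefix: one row per bracket, with the count in column n
counts <- as.data.frame(
  table(jobbracket = toy$jobbracket),
  responseName = "n",
  stringsAsFactors = FALSE
)
```

The prefix stays a character (categorical) value here, which is usually what you want for grouping; no conversion to a number is needed. The `counts` data frame can then be passed to a plotting function, for example `barplot(counts$n, names.arg = counts$jobbracket)`.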
