I am working with the OJ data set in the ISLR package. I need to add two columns to the data frame. One column is the product of two numerical variables. The second column is the product of a numerical and a categorical variable.

I added the first column (numerical*numerical) using the mutate function from the dplyr package in R as follows:

library(ISLR)   # provides the OJ data set
library(dplyr)  # provides %>% and mutate()
OJ %>% 
  mutate(`StoreID:PriceCH` = StoreID * PriceCH)

I was able to add this column successfully. But when I tried to do the same for the categorical*numeric column, I got a warning instead.

OJ %>% 
  mutate(`Store7:PriceCH` = Store7*PriceCH)

Warning message:
In Ops.factor(Store7, PriceCH) : ‘*’ not meaningful for factors 

Can anyone suggest how I can add a column that is the product of a categorical and a numerical variable?

My output should be something like this,

[image: expected output]

Thank you

student_R123
  • Possible duplicate of [R error "sum not meaningful for factors"](https://stackoverflow.com/questions/18045096/r-error-sum-not-meaningful-for-factors) – jogo Sep 18 '19 at 14:07
  • I didn't use images to show the data. I used the image to show my expected output. – student_R123 Sep 18 '19 at 14:16
  • What do you intend to do to turn a categorical variable into numeric, or otherwise multiply something by a category? – camille Sep 18 '19 at 15:08

3 Answers

Apply one-hot encoding to Store7 first:

OJ <- cbind(OJ, sapply("Yes", function(x) as.integer(x == OJ$Store7)))
names(OJ)[ncol(OJ)] <- "Store7_Yes"
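
To finish the calculation from the question, the indicator column can then be multiplied by the price. A minimal sketch on a small made-up data frame (`toy` and its values are illustrative stand-ins for `OJ`, not the real data), assuming `Store7` has the levels "No" and "Yes":

```r
# Toy stand-in for OJ; the values are made up for illustration.
toy <- data.frame(
  Store7  = factor(c("No", "Yes", "Yes", "No")),
  PriceCH = c(1.75, 1.86, 1.69, 1.99)
)

# 0/1 indicator: 1 where Store7 is "Yes", 0 otherwise.
toy$Store7_Yes <- as.integer(toy$Store7 == "Yes")

# Product of the indicator and the numeric column.
toy[["Store7:PriceCH"]] <- toy$Store7_Yes * toy$PriceCH

print(toy)
```

With a 0/1 indicator the product equals `PriceCH` in "Yes" rows and 0 in "No" rows.
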
slava-kohut

Conceptually, it does not make much sense (in most cases) to multiply categorical variables.

Though if you want to do so, you have to transform your data to a numeric scale. Be aware that this is not always straightforward.

This could be a starting point:

library(ISLR)      # provides the OJ data set
library(tidyverse)

Result <- OJ %>% 
  mutate(`StoreID:PriceCH` = StoreID * PriceCH) %>% 
  mutate(Store7Numeric = as.character(Store7)) # To avoid possible mistakes

Result <- Result %>% 
  mutate(Store7Numeric = ifelse(Store7Numeric == "No", 0, 1)) # Check this

Result <- Result %>% 
  mutate(Store7Numeric = as.numeric(Store7Numeric)) %>% # To numeric
  mutate(`Store7:PriceCH` = Store7Numeric * PriceCH) %>% # Your calculation
  select(-Store7Numeric) # Remove the helper numeric variable, if you want
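
The `as.character()` step matters most when a factor's labels look like numbers. A small sketch with a made-up factor (`store` is hypothetical, not a column of `OJ`) shows the difference between the internal level codes and the labels:

```r
# Hypothetical factor whose labels happen to be digits.
store <- factor(c("7", "1", "4", "7"))

# as.numeric() on a factor returns the level codes, not the labels.
codes <- as.numeric(store)                  # 3 1 2 3

# Converting to character first recovers the labelled values.
values <- as.numeric(as.character(store))   # 7 1 4 7

print(codes)
print(values)
```
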
Orlando Sabogal

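The warning is due to the variable Store7 being a factor (see str(OJ)), so you must make it numeric. Note that as.numeric() on a factor returns the level codes (1 for "No", 2 for "Yes"), so subtract 1 to get a 0/1 indicator:

OJ$Store7 <- as.numeric(OJ$Store7) - 1  # "No" -> 0, "Yes" -> 1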
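
The conversion can be checked on a made-up factor, assuming `Store7` has the levels "No" and "Yes" (`store7` below is a hypothetical stand-in, not the real column). `as.numeric()` returns the level codes, which start at 1:

```r
# Hypothetical stand-in for OJ$Store7.
store7 <- factor(c("No", "Yes", "Yes", "No"))

lv        <- levels(store7)          # "No" "Yes"
codes     <- as.numeric(store7)      # 1 2 2 1
indicator <- as.numeric(store7) - 1  # 0 1 1 0

print(lv)
print(codes)
print(indicator)
```
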