I am new to R. For my assignment, I am trying to create dummy variables for three categorical variables, each with several levels. However, every approach I tried ran into a problem:

Method 1, following https://stats.idre.ucla.edu/r/modules/coding-for-categorical-variables-in-regression-models/ The code:

> housing_prices2$Fuel.Type.f <- factor(housing_prices2$Fuel.Type)
> is.factor(housing_prices2$Fuel.Type.f)
[1] TRUE
> housing_prices2$Fuel.Type.f[1:10]
 [1] Electric Gas      Gas      Gas      Gas      Gas      Oil     
 [8] Oil      Electric Gas     
Levels: Electric Gas None Oil Solar Unknown/Other Wood

This works well. However, I ran into a problem on the next line:

> summary(lm(write ~ Fuel.Type.f, data = housing_prices2))  
Error in model.frame.default(formula = write ~ Fuel.Type.f, data = housing_prices2,:          object is not a matrix

I have no idea what this error means and it doesn't make sense to me, so I decided to try another method.

Method 2, following "Convert categorical variables to numeric in R".

For the variable Fuel.Type, it works well:

> Fuel.Type <- as.factor(c("Electric", "Gas", "None", "Oil", "Solar", "Unknown/Other",
+                          "Wood"))
> Fuel.Type
[1] Electric      Gas           None          Oil           Solar        
[6] Unknown/Other Wood         
Levels: Electric Gas None Oil Solar Unknown/Other Wood
> unclass(Fuel.Type)
[1] 1 2 3 4 5 6 7
attr(,"levels")
[1] "Electric"      "Gas"           "None"          "Oil"          
[5] "Solar"         "Unknown/Other" "Wood"         

But when I try to generate dummies for the other variables, I get this warning:

> housing_prices2$Heat.Type.f[1:10]
NULL
Warning message:
Unknown or uninitialised column: 'Heat.Type.f'. 

I am clueless about what is causing this one either. Any suggestions are appreciated!

BTW, here is a sample of my data:

 $ Fuel.Type    : chr  "Electric" "Gas" "Gas" "Gas"
 $ Heat.Type    : chr  "Electric" "Hot Water" "Hot Water" "Hot Air"
 $ Sewer.Type   : chr  "Private" "Private" "Public" "Private"
Stan
Steve Shi

1 Answer

I figured out my problem last night. The problem was that I mixed up my data objects, since I had read the file into a new one named hp2:

hp2 <- read_excel("Desktop/hw/424/hw1/housing_prices2.xlsx")
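A minimal sketch of this kind of mix-up, assuming the data live in `hp2` while the `.f` columns were created on, or read from, a different object. The toy data frame below is built from the sample rows in the question rather than from the Excel file, and the loop over the three column names is only one possible way to do the conversion:

```r
# Toy stand-in for the real data (values taken from the sample rows in the
# question); the real hp2 would come from read_excel().
hp2 <- data.frame(
  Fuel.Type  = c("Electric", "Gas", "Gas", "Gas"),
  Heat.Type  = c("Electric", "Hot Water", "Hot Water", "Hot Air"),
  Sewer.Type = c("Private", "Private", "Public", "Private"),
  stringsAsFactors = FALSE
)

# A column that was never created does not exist yet, so hp2$Heat.Type.f
# would be NULL at this point.
"Heat.Type.f" %in% names(hp2)

# Create a factor version of each categorical column on the SAME object
# that will later be passed to lm().
for (v in c("Fuel.Type", "Heat.Type", "Sewer.Type")) {
  hp2[[paste0(v, ".f")]] <- factor(hp2[[v]])
}

levels(hp2$Heat.Type.f)
```

Checking `names()` on the exact object passed as `data =` is a cheap way to catch this before fitting anything.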

In addition, I used the wrong Y variable as well; see:

summary(lm(write ~ Fuel.Type.f, data = housing_prices2))  

My Y variable is actually not write; that name came from the tutorial's example.
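For completeness, a hedged sketch of what a corrected call might look like. `Price` is a hypothetical response name and its values are made up for illustration, since the real Y variable is not shown here. `lm()` expands a factor predictor into dummy columns on its own, and `model.matrix()` displays them. The original call most likely failed because `write` is not a column of the data, so R fell back to the base function `write()`:

```r
# Toy data: Price is a HYPOTHETICAL response with made-up values.
hp2 <- data.frame(
  Price     = c(150, 210, 195, 230, 180, 175),
  Fuel.Type = c("Electric", "Gas", "Gas", "Gas", "Oil", "Oil")
)
hp2$Fuel.Type.f <- factor(hp2$Fuel.Type)

# Check that every variable in the formula is a column of the data
# before fitting; "write" fails this check.
all(c("Price", "Fuel.Type.f") %in% names(hp2))
"write" %in% names(hp2)

# lm() creates the dummy variables itself; the first level (Electric)
# is the reference category under the default treatment contrasts.
fit <- lm(Price ~ Fuel.Type.f, data = hp2)
coef(fit)

# The dummy columns that correspond to the factor in the model.
model.matrix(~ Fuel.Type.f, data = hp2)
```

With this setup there is no need to build the dummies by hand or to call `unclass()`, which only returns the integer level codes.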

Stan
Steve Shi