
I want to train models using different algorithms. For instance, this works:

dd=read.arff("china.arff")
model=lm(Effort~ ., data=dd)
fitted(model)

But the following code gives NULL for the same dataset:

install.packages("neuralnet")
library(neuralnet)
model=neuralnet(Effort~N_effort+Duration, data=dd, 
                   hidden=1,err.fct="ce", linear.output=FALSE)
fitted(model)
# Gives NULL

I get a similar result with a randomForest model.

It is not possible that these models have no errors, so what could be the problem?

structure(list(Output = c(150, 98, 27, 60, 69, 19, 14, 17, 64, 
60, 27, 17, 41, 40, 12, 38, 57, 20, 66, 112, 28, 68, 15, 15), 
    Inquiry = c(75, 70, 0, 20, 1, 0, 0, 15, 14, 20, 29, 8, 16, 
    20, 13, 24, 12, 24, 13, 21, 4, 0, 6, 0), RawFPcounts = c(1750, 
    1902, 535, 660, 478.89, 377.33, 256.25, 262.73, 715.79, 690.43, 
    465.45, 298.67, 490.59, 802.35, 220, 487.62, 550.91, 363.64, 
    1073.91, 1310, 476.19, 694, 189.52, 273.68), AdjFP = c(1750, 
    1902, 428, 759, 431, 283, 205, 289, 680, 794, 512, 224, 417, 
    682, 209, 512, 606, 400, 1235, 1572, 500, 694, 199, 260), 
    Effort = c(102.4, 105.2, 11.1, 21.1, 28.8, 10, 8, 4.9, 12.9, 
    19, 10.8, 2.9, 7.5, 12, 4.1, 15.8, 18.3, 8.9, 38.1, 61.2, 
    3.6, 11.8, 0.5, 6.1)), class = "data.frame", row.names = c(NA, 
-24L))
desertnaut
Khan
  • Can you please provide your data using dput()? – Hunaidkhan Dec 31 '18 at 10:28
  • @Hunaidkhan, I'm sorry, I don't know about dput. My dataset is about software effort estimation; it has several input features and "Effort" as the output, with 499 instances. – Khan Dec 31 '18 at 10:32
  • @Hunaidkhan, I got the data using dput(); it is a long list, but it has 0 values. So is the problem due to the 0 values? – Khan Dec 31 '18 at 10:35
  • What do you mean you don't know about dput? Just type `?dput` in the console and start reading. In addition, there's a [whole set of answers](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) that will help you to convey enough information for us to at least make an educated guess. – Roman Luštrik Dec 31 '18 at 10:48
  • I edited the question and included the output of dput(). – Khan Dec 31 '18 at 11:02

1 Answer


To the best of my knowledge, fitted is not widely used in R (except perhaps in the context of GLMs); to be honest, I had never heard of the function before (and I have been programming in R for ~7 years now).

So it should not come as a surprise that, outside the context of (generalized) linear models, i.e. in models like neural networks or random forests, no dedicated method is implemented: the default fitted method just looks for a fitted.values component in the model object and, finding none, returns NULL.
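A minimal sketch of why NULL comes back, assuming (as I understand the stats package) that the default fitted method simply extracts a fitted.values component from whatever object it is given. The two lists below are hypothetical stand-ins for model objects, not real fits:

```r
# Hypothetical stand-ins for fitted model objects (plain lists, not real models)
lm_like <- list(fitted.values = c(1.5, 2.5, 3.5))  # stores its fits the way lm does
rf_like <- list(predicted = c(1.5, 2.5, 3.5))      # stores them under another name

res_lm_like <- fitted(lm_like)  # default method finds the fitted.values component
res_rf_like <- fitted(rf_like)  # no such component, so there is nothing to extract

print(res_lm_like)
print(is.null(res_rf_like))
```

So a NULL here says nothing about the quality of the fit; the model object just keeps its predictions somewhere else.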

The good news may come from asking yourself why exactly you want fitted in the first place, because in practice, and broadly speaking, fitted is roughly equivalent to predict, at least for simple linear models:

df <- data.frame(income=c(5,3,47,8,6,5),
               won=c(0,0,1,1,1,0),
               age=c(18,18,23,50,19,39),
               home=c(0,0,1,0,0,1))

md1 <- lm(income ~ age + home, data=df) # linear model



fitted(md1)
        1         2         3         4         5         6 
 7.893273  7.893273 28.320749 -1.389725  7.603179 23.679251 

predict(md1)
        1         2         3         4         5         6 
 7.893273  7.893273 28.320749 -1.389725  7.603179 23.679251 

while in the case of GLMs you just need to specify type='response' when predicting, in order for the two functions to again return practically identical results:

md2 <- glm(factor(won) ~ age + home, data=df, family=binomial(link="logit")) #glm

fitted(md2)
        1         2         3         4         5         6 
0.4208590 0.4208590 0.4193888 0.7274819 0.4308001 0.5806112 

predict(md2)
         1          2          3          4          5          6 
-0.3192480 -0.3192480 -0.3252830  0.9818840 -0.2785876  0.3252830 

predict(md2, type='response')
        1         2         3         4         5         6 
0.4208590 0.4208590 0.4193888 0.7274819 0.4308001 0.5806112 

So, while fitted for a random forest model gives indeed NULL:

library(randomForest)
rf <- randomForest(income ~ age + home, data=df)
fitted(rf)
NULL

you can arguably get your required results simply with predict:

predict(rf)
        1         2         3         4         5         6 
 9.748170 11.463800  5.186755 13.905696  8.791710 29.000931 
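One caveat worth checking on your own data: as far as I know, predict on a randomForest object without newdata returns the out-of-bag predictions stored in the model, not in-sample fits. If you want predictions for the training rows themselves, pass them explicitly. A small sketch with the same toy data (the names oob_pred and in_sample are mine, and the seed is arbitrary):

```r
library(randomForest)  # assumes the package is installed

df <- data.frame(income = c(5, 3, 47, 8, 6, 5),
                 age    = c(18, 18, 23, 50, 19, 39),
                 home   = c(0, 0, 1, 0, 0, 1))

set.seed(1)  # random forests are stochastic; fix the seed for repeatability
rf <- randomForest(income ~ age + home, data = df)

oob_pred  <- predict(rf)                # no newdata: out-of-bag predictions
in_sample <- predict(rf, newdata = df)  # training rows passed explicitly

# The out-of-bag values should be the ones stored in the model object
print(all.equal(unname(oob_pred), unname(rf$predicted)))
```

The in-sample values are usually much closer to the observed response than the out-of-bag ones, so which of the two you want depends on whether you are after a description of the fit or an honest error estimate.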

The following threads might also be useful:

Is there a difference between the R functions fitted() and predict()?

Finding the fitted and predicted values for a statistical model

desertnaut