2

I recently tried to plot a regression pane in RStudio using the plotly library and read this post: Add Regression Plane to 3d Scatter Plot in Plotly

I followed the exact same procedure and ended up with a regression plane, which is obviously not correct:

EDIT: I followed the proposal in the first answer and my result looks like this: Regression Pane

Here is my code, I commented every step: sm is the data.frame I used

library(reshape2);
sm <- read.delim("Supermodel.dat", header = TRUE);
x1 <- sm$age
x2 <- sm$years
y <- sm$salary
df <- data.frame(x1, x2, y);


### Estimation of the regression plane
mod <- lm(y ~ x1+x2, data = df, na.action =     
        na.omit);

cf.mod <- coef(mod)

### Calculate z on a grid of x-y values
x1.seq <- seq(min(x1),max(x1),length.out=231)
x2.seq <- seq(min(x2),max(x2),length.out=231)
z.mtx <- t(outer(x1.seq, x2.seq, function(x1,x2) 
  cf.mod[1]+cf.mod[2]*x1+cf.mod[3]*x2))


#### Draw the plane with "plot_ly" and add points with "add_trace"
library(plotly) 

# Draw plane with plotly surface plot
plane <- plot_ly(x=~x1.seq, y=~x2.seq, z=~z.mtx, colors = 
c("#f5cb11", #b31d83"),type="surface") %>%
  add_trace(data=df, x=x1, y=x2, z=y, mode="markers", 
        type="scatter3d", 
        marker = list(color="black", opacity=0.7, symbol=105)) %>%
  layout(scene = list(aspectmode = "manual", aspectratio = list(x=1, 
    y=1, z=1), xaxis = list(title = "Age", range = c(12,24)), yaxis = 
    list(title = "Work experience (years)", range = c(0,10)), zaxis = 
list(title = "Salary p.a. (k)", range = c(0,90) )))

plane

I checked with the str-function, if the x1.seq and x2.seq have the same number of entries, and they both have 231 number values in them. The plane gets calculated and is shown, but it obviously still wrong.

PS: If you want to run the code, just download the file Supermodel.dat from Andy Fields website (https://studysites.uk.sagepub.com/dsur/study/articles.htm) under Regression.

Thanks in advance, rikojir

rikojir
  • 115
  • 2
  • 13

1 Answers1

3

Here is an illustrative example that shows how the observed points and the regression plane can be plotted together in a 3D plot generated using the plotlty package.
Hope it can help you.

### Data generating process
set.seed(1234)
n <- 50
x1 <- runif(n); x2 <- runif(n)
x3 <- rnorm(n)>0.5
y <- 2*x1-x2+rnorm(n, sd=0.25)
df <- data.frame(y, x1, x2, x3)

### Estimation of the regression plane
mod <- lm(y ~ x1+x2)
cf.mod <- coef(mod)

### Calculate z on a grid of x-y values
x1.seq <- seq(min(x1),max(x1),length.out=25)
x2.seq <- seq(min(x2),max(x2),length.out=25)
z <- t(outer(x1.seq, x2.seq, function(x,y) cf.mod[1]+cf.mod[2]*x+cf.mod[3]*y))

#### Draw the plane with "plot_ly" and add points with "add_trace"
cols <- c("#f5cb11", "#b31d83")
cols <- cols[x3+1] 
library(plotly)
p <- plot_ly(x=~x1.seq, y=~x2.seq, z=~z,
  colors = c("#f5cb11", "#b31d83"),type="surface") %>%
  add_trace(data=df, x=x1, y=x2, z=y, mode="markers", type="scatter3d",
  marker = list(color=cols, opacity=0.7, symbol=105)) %>%
  layout(scene = list(
    aspectmode = "manual", aspectratio = list(x=1, y=1, z=1),
    xaxis = list(title = "X1", range = c(0,1)),
    yaxis = list(title = "X2", range = c(0,1)),
    zaxis = list(title = "Y", range = pretty(z)[c(1,8)])))
print(p)

Here is the 3D plot generated by the above code:

enter image description here

Marco Sandri
  • 23,289
  • 7
  • 54
  • 58
  • Thanks for your help, I tried to follow your approach, now I see the plane, but it is still placed wrong inside the points. I edited my post, hope you have an idea what is still wrong. – rikojir Jun 02 '17 at 13:31
  • I am sure you know that you should not edit the wrong code in question to correct one. Then other readers do not know where was the error. You should have pointed out the error or put correct code in your own answer. – rnso Jun 02 '17 at 17:48
  • @rnso I agree. Sorry. We will rewrite the question in the original form. – Marco Sandri Jun 02 '17 at 20:01
  • I tried your solution Marco, and now the plane has changed (fits the data better), but it still is wrong. I edited my post and changed the code. – rikojir Jun 06 '17 at 07:28
  • @MarcoSandri I already linked the file in the original post. You can find it here: https://studysites.uk.sagepub.com/dsur/study/articles.htm Just open the regression tab, there you can find it. Thanks for your effort! The weird thing is: I tried your solution with another dataset and there the plane looks fine, but here it is somehow messed up! – rikojir Jun 06 '17 at 09:52
  • 1
    @rikojir I do not find any problem with your 3D plot. The regression plane fit bad the observed points not for a problem of the R code but because a linear regression is not a good model for your data. See the R-squared of `mod`: 0.178. It is rather low. – Marco Sandri Jun 06 '17 at 10:18
  • Okay, that sounds right. Guess I was too blinded by the plot, but the model is simply not good. Thanks for your help! – rikojir Jun 09 '17 at 12:52