Hello, I'm trying to use abline() to draw a regression line on my scatterplot. I've tried a few different methods, but I'm not sure what I'm doing wrong. (I'm fairly new to R.)

Edit: the code that I have also tried:

plot(data$GRE.Score, data$Chance.of.Admit, main = "Regression Line plot", 
     xlab = "Chance of Admit", ylab = "GRE Score", 
     pch = 19, frame = FALSE)

abline(lm(GRE.Score ~ Chance.of.Admit, data = data), col = "red")
IJA
  • http://www.sthda.com/english/wiki/scatter-plots-r-base-graphs – DPH Apr 14 '21 at 00:15
  • I've tried it from that same page as well, however, I get: Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) : plot.new has not been called yet – IJA Apr 14 '21 at 00:19

1 Answer

I hope this small example points you in the right direction:

# dummy data
df <- data.frame(x = 1:100, y = rpois(100, lambda = 4))

# plot
plot(df$x, df$y, main = "Main title",
     xlab = "X axis title", ylab = "Y axis title",
     pch = 19, frame = FALSE)

# linear regression line (can only be called AFTER the plot call)
abline(lm(y ~ x, data = df), col = "blue")
# vertical line that cuts X at 10
abline(v = 10, col = "red")
# horizontal line that cuts Y at 10
abline(h = 10, col = "green")
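To illustrate why the order matters, here is a minimal sketch of what happens when abline() is called on a graphics device that has no plot yet. It writes to a temporary PDF (an assumption made only so the snippet also runs non-interactively) and captures the error message instead of stopping:

```r
# Open a fresh graphics device; nothing has been plotted on it yet
tmp_pdf <- tempfile(fileext = ".pdf")
pdf(tmp_pdf)

# abline() adds to an existing plot, so calling it first raises an error
msg <- tryCatch({
  abline(h = 10)
  "no error"
}, error = function(e) conditionMessage(e))

dev.off()
print(msg)
```

The same error can appear when the plot() call itself failed or ran on a different device, for example when plot() and abline() are executed in separate chunks of an R Markdown notebook.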

With the Kaggle data:

df <- read.csv("C:/.../Admission_Predict.csv")
# plot
plot(df$GRE.Score, df$Chance.of.Admit, main = "Main title",
     xlab = "X axis title", ylab= "Y axis title",
     pch = 19, frame = FALSE)
abline(lm(Chance.of.Admit ~ GRE.Score, data = df), col = "red")
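Note that the formula follows the same order as the plot: the y-axis variable goes on the left of ~ and the x-axis variable on the right. In the question's code the formula is reversed relative to the axes (GRE.Score ~ Chance.of.Admit while GRE.Score is on the x-axis), so the fitted line would not match the points. A minimal sketch with simulated data (the values below are made up for illustration and are not taken from the Kaggle file; the PDF device is only there so the snippet runs non-interactively):

```r
set.seed(1)

# Simulated stand-in for the admissions data (hypothetical values)
sim <- data.frame(GRE.Score = round(runif(50, min = 290, max = 340)))
sim$Chance.of.Admit <- 0.01 * (sim$GRE.Score - 260) + rnorm(50, sd = 0.05)

# Response (y-axis) on the left of ~, predictor (x-axis) on the right
fit <- lm(Chance.of.Admit ~ GRE.Score, data = sim)

out <- tempfile(fileext = ".pdf")
pdf(out)
plot(sim$GRE.Score, sim$Chance.of.Admit,
     xlab = "GRE Score", ylab = "Chance of Admit",
     pch = 19, frame = FALSE)
abline(fit, col = "red")   # same x and y roles as in plot()
dev.off()
```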
DPH
  • Hello, this is my code; I've tried it the way you have it laid out as well: x <- data$Chance.of.Admit y <- data$GRE.Score plot(x, y, main = "Regression Line plot", xlab = "Chance of Admit", ylab = "GRE Score", pch = 19, frame = FALSE) abline(lm(y ~ x, data = data), col = "red") But it is still giving me the same error: "Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) : plot.new has not been called yet". I apologize, I'm very new to this, so it may be something that I'm not understanding! – IJA Apr 14 '21 at 00:37
  • I guess data is not a good identifier for your variable... use df instead... I just ran this code (yours, slightly modified, on the dummy data): plot(df$x, df$y, main = "Regression Line plot", xlab = "Chance of Admit", ylab = "GRE Score", pch = 19, frame = FALSE) abline(lm(y ~ x, data = df), col = "red") – DPH Apr 14 '21 at 00:40
  • Unfortunately, now it is giving me "Error in model.frame.default(formula = y ~ x, data = df, drop.unused.levels = TRUE) : 'data' must be a data.frame, environment, or list" – IJA Apr 14 '21 at 00:43
  • I guess there is no way around a reproducible example (the bug should be somewhere in your data, but I can only guess without being able to reproduce it): https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – DPH Apr 14 '21 at 00:47
  • Yeah, it's very strange, it works completely fine without the abline(), but the moment I try to run them in sequence including the abline() something goes wrong! Thank you so much for trying to help! I'll just move past this one for now I suppose! – IJA Apr 14 '21 at 01:00
  • Let's not give up that easily: Are both columns of type numeric? And are there no missing values (meaning only complete rows)? Call str(data) or str(df) and tell me what data type the columns are – DPH Apr 14 '21 at 01:01
  • Yes! Int/Num, and there's no missing data because it's actually a beginner's dataset from Kaggle (https://www.kaggle.com/mohansacharya/graduate-admissions). I've loaded the dataset as data <- read.csv because that's what was specified for me, so I don't think the name should be causing me any problems – IJA Apr 14 '21 at 01:05
  • I just edited my answer to use read.csv on the file from Kaggle, plot it, and draw the regression line... this should work fine for you, as long as you put in your file path – DPH Apr 14 '21 at 01:14
  • So I have a question, because I'm trying to understand what I'm doing wrong. When I first read the data, I set the path as: data <- read.csv("Admission_Predict.csv") I didn't point it to any location on my computer, but it works. Would that be the problem then? – IJA Apr 14 '21 at 01:26
  • @Isha if you do not specify a full path, R looks in the current working directory (which "by chance" can be the location of the file you need) ... you can check it by calling getwd() and change it with setwd("C:/.../..."), for example – DPH Apr 14 '21 at 01:29
  • Thank you so much for all your help! I really appreciate it – IJA Apr 14 '21 at 01:37
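
The checks discussed in the comments above can be sketched as follows. This is only an illustration under stated assumptions: the small CSV written to a temporary directory is a made-up stand-in for Admission_Predict.csv, and the name admissions is hypothetical. The last two lines show one possible reason for the "'data' must be a data.frame" error: in a fresh session, df and data already exist as functions in base R, so if the assignment from read.csv never ran, lm() receives a function instead of a data frame.

```r
# Hypothetical stand-in for the Kaggle CSV, written to a temp directory
csv_path <- file.path(tempdir(), "Admission_Predict.csv")
write.csv(data.frame(GRE.Score = c(300L, 320L, 340L),
                     Chance.of.Admit = c(0.5, 0.7, 0.9)),
          csv_path, row.names = FALSE)

getwd()                 # directory used to resolve relative file paths
file.exists(csv_path)   # confirm the file is where R will look for it

admissions <- read.csv(csv_path)
is.data.frame(admissions)   # lm(..., data = ) needs a data frame
str(admissions)             # column names and types

# Without an assignment, these names refer to built-in functions
is.function(stats::df)
is.function(utils::data)
```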