In the table below, I'm trying to create scatterplots of the X3-X3.6 columns against the risk score to see whether there is any correlation. All of this data is stored in a data frame.

I'm new to R. What is the best way to extract this data? I'm using plot and pairs, plus multiple lines of code to extract different columns, and it's getting extremely messy.

Attempt

plot(cancerData[2], cancerData[3])

Here column 2 is the risk score (Risk.Score) and column 3 is X3.GAL.

This results in the following error:

Error in stripchart.default(x1, ...) : invalid plotting method

[Image: table of the data]

Ryan Shocker

1 Answer

You can use a loop to create all of the plots. If your data frame were named "data" and the risk score column "rating", the code for the loop would look something like this:

# loop over the column names from the third column (X3) to the last
for (variable in names(data)[3:ncol(data)]) {
  # plot rating against the current column;
  # data[[variable]] looks the column up by name, so no eval(parse()) is needed
  plot(data$rating, data[[variable]], xlab = "rating", ylab = variable)
}

You could also write the same loop using column positions instead of names:

for (variable in 3:ncol(data)) {
  plot(data$rating, data[[variable]])
}
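
As for the error in the question: as far as I can tell, cancerData[2] (single brackets) is still a one-column data frame, so plot() dispatches to its data frame method, which hands a single numeric column to stripchart() and passes the second argument along as if it were the plotting method. Extracting the columns as vectors with double brackets avoids that. Below is a minimal sketch of the loop wrapped in a function; cancer_demo, plot_against_score, and the column names X3.1 and X3.2 are made up for illustration, and the values are simulated.

```r
# Hypothetical stand-in for the question's data frame; the values are simulated
# and only Risk.Score and X3.GAL are names taken from the question.
set.seed(1)
cancer_demo <- data.frame(
  ID         = 1:20,
  Risk.Score = runif(20),
  X3.GAL     = rnorm(20),
  X3.1       = rnorm(20),
  X3.2       = rnorm(20)
)

# Single brackets keep the data frame wrapper; double brackets return the column itself.
is.data.frame(cancer_demo[2])   # TRUE
is.numeric(cancer_demo[[2]])    # TRUE

# Draw one scatterplot per chosen column against the score column.
# Returns (invisibly) the names of the columns that were plotted.
plot_against_score <- function(df, score_col = 2, cols = 3:ncol(df)) {
  for (j in cols) {
    plot(df[[score_col]], df[[j]],
         xlab = names(df)[score_col], ylab = names(df)[j])
  }
  invisible(names(df)[cols])
}
```

Calling plot_against_score(cancer_demo) then draws the three scatterplots one after another; running par(mfrow = c(1, 3)) first would place them side by side on one page.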

You could also just use the corrplot package to visualize the correlation matrix of the data frame. Looping to create scatter plots is going to give you a lot of output. If you drop the columns you don't need, you could just do:

plot(data)

This will create a scatter plot matrix, but with many variables the individual plots will be small and hard to read.
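
If the goal is mainly to see which columns correlate with the risk score, the numbers themselves may be easier to read than many small plots. Here is a rough sketch using base R's cor(), which computes the correlation matrix that a package like corrplot would then draw. The data frame demo is simulated, and every column name except Risk.Score and X3.GAL is an assumption.

```r
# Simulated stand-in data; X3.2 is deliberately built to correlate with the score.
set.seed(42)
demo <- data.frame(
  ID         = 1:30,
  Risk.Score = runif(30),
  X3.GAL     = rnorm(30),
  X3.1       = rnorm(30)
)
demo$X3.2 <- 2 * demo$Risk.Score + rnorm(30, sd = 0.1)

# Keep only the columns of interest (this drops ID).
keep <- c("Risk.Score", "X3.GAL", "X3.1", "X3.2")

# Pearson correlation of every kept column with the risk score:
# take the Risk.Score column of the full correlation matrix.
score_cor <- cor(demo[keep])[, "Risk.Score"]
print(round(score_cor, 2))

# A scatterplot matrix restricted to the same columns stays readable:
# pairs(demo[keep])
```

Columns whose correlation is close to 1 or -1 are the ones worth plotting individually against the risk score.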