I'm fairly new to R and to coding, so I may not explain this well, but I couldn't find a better forum to ask.
I have a 6x6 matrix in which each row is a gene and each column is a sample.
I want the genes on the x-axis and the sample values on the y-axis, so that each gene has its 6 sample values plotted above it.
I have this matrix in Excel, and when I highlight it and plot it, I get exactly what I want.
But trying to reproduce it in R gives me a giant lattice plot at best.

I've tried boxplot(), scatterchart(), plot(), and ggplot().
I'm assuming I have to alter my matrix but I don't know how.

B. Go
  • Try `plot(x = rep(1:length(data$y1), times = 6), y = data$y1)` where data is the name of the matrix or data frame, and y1 being the values... without knowing what your data looks like there is little more help that can be given. From the description of your data (6×6) you need to make it long format first (1×36) and times = 6 may need to be each = 6 depending how you make it long – rg255 Jun 18 '19 at 22:34
  • This question would be easier to answer with some [example data](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). But it sounds like you need to convert from "wide" to "long" data (lots of questions + answers here cover that) and that `geom_jitter` from `ggplot2` would be a suitable visualization. – neilfws Jun 18 '19 at 22:50

2 Answers

This may help:

library(tidyverse)


gene <- c("a", "b", "c", "d", "e", "f")
x1 <- c(1,2,3,4,5,6)
x2 <- c(2,3,4,5,-6,7)
x3 <- c(3,4,5,6,7,8)
x4 <- c(4,-5,6,7,8,9)
x5 <- c(9,8,7,6,5,4)
x6 <- c(5,4,3,2,-1,0)

df <- data.frame(gene, x1, x2, x3, x4, x5, x6)  # wide format: one row per gene, one column per sample
as_tibble(df)  # convenient way to check the values and column types
df <- df %>% gather(sample, observation, 2:7)  # convert to long format (gather() is superseded by pivot_longer() in newer tidyr)
as_tibble(df)  # now 36 rows with columns gene, sample, observation

#example plots

p1 <- ggplot(df, aes(x = gene, y = observation, color = sample)) + geom_point()
p1
p2 <- ggplot(df, aes(x = gene, y = observation, group = sample, color = sample)) +
  geom_line()
p2
p3 <- p2 + geom_point()
p3
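
If the data starts out as a matrix rather than as separate vectors, one possible route to the same long format is sketched below. It assumes a hypothetical matrix `m` with named dimensions (`gene` for rows, `sample` for columns), which is not something the question shows. It uses base R's `as.data.frame(as.table())` for the reshaping, and only draws the plot if ggplot2 is installed.

```r
# Hypothetical 6x6 matrix: one gene per row, one sample per column.
# The values and dimension names are made up for illustration.
m <- matrix(c(1, 2, 3, 4, 5, 6,
              2, 3, 4, 5, -6, 7,
              3, 4, 5, 6, 7, 8,
              4, -5, 6, 7, 8, 9,
              9, 8, 7, 6, 5, 4,
              5, 4, 3, 2, -1, 0),
            nrow = 6,
            dimnames = list(gene = letters[1:6], sample = paste0("x", 1:6)))

# Reshape to long format: one row per (gene, sample) pair, 36 rows in total.
long <- as.data.frame(as.table(m), responseName = "observation")
head(long)

# Plot one point per observation, with genes on the x-axis.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(long, ggplot2::aes(x = gene, y = observation, color = sample)) +
    ggplot2::geom_point()
  print(p)
}
```

The resulting `long` data frame has the same three columns as the gathered `df` above, so the `p1`, `p2` and `p3` plots should work on it unchanged.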
dbo

This is easy to solve. If your matrix is 6x6 with one gene per row and one observation per column (so six observations per gene), you first need to put it in long format (36 values). With such a simple layout this can be done with unlist(), and the result is then plotted against a vector of numbers representing the genes. (unlist() flattens a data frame column by column; a plain matrix is returned unchanged, but plot() still reads its values in that same column order.)

# Here I make some dummy data - a 6x6 matrix of random numbers:
df1 <- matrix(rnorm(36,0,1), ncol = 6)

# To help show which way the data unlists, and make the 
#   genes different, I add 4 to gene 1:
df1[1,] <- df1[1,] + 4

#### TL;DR - HERE IS THE SOLUTION ####
# Then plot it, using rep to make the x-axis data vector
plot(x = rep(1:6, times = 6), y = unlist(df1))
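
Whether the x vector needs `times =` or `each =` depends on the order in which the values are flattened. The small sketch below uses a hypothetical 2x3 matrix (two genes, three samples) to show both cases. `as.vector()` is used here to make the flattening explicit.

```r
# Hypothetical 2x3 matrix: rows are genes, columns are samples.
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("g1", "g2"), c("s1", "s2", "s3")))

# R stores matrices column by column, so flattening walks down each column:
# g1 and g2 for s1, then g1 and g2 for s2, and so on.
flat <- as.vector(m)

# The matching gene index therefore cycles through the genes: use times =.
gene_idx <- rep(seq_len(nrow(m)), times = ncol(m))

# If the matrix is transposed first (genes in columns), each gene's values
# come out in one run, so the index needs each = instead.
flat_t <- as.vector(t(m))
gene_idx_t <- rep(seq_len(nrow(m)), each = ncol(m))

# Sanity check: grouping by the index recovers the per-gene row sums.
tapply(flat, gene_idx, sum)
tapply(flat_t, gene_idx_t, sum)
rowSums(m)
```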

To improve readability, add axis labels:

# With axis labels
plot(x = rep(1:6, times = 6), y = unlist(df1), 
  xlab = 'Gene', ylab = 'Value')
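
To show gene names instead of the numbers 1 to 6 on the x-axis, one option is to suppress the default axis and draw a labelled one. This is a minimal sketch that assumes the matrix has row names. The matrix `m` and its `gene1` ... `gene6` names are made up for illustration.

```r
# Hypothetical data: 6 genes (rows) x 6 samples (columns) with row names.
set.seed(1)
m <- matrix(rnorm(36), ncol = 6,
            dimnames = list(paste0("gene", 1:6), NULL))

# One x position per value, cycling through the genes (column-major order).
gene_idx <- rep(seq_len(nrow(m)), times = ncol(m))

# xaxt = "n" suppresses the default numeric x-axis ...
plot(x = gene_idx, y = as.vector(m), xaxt = "n",
     xlab = "Gene", ylab = "Value")
# ... so that axis() can draw one tick per gene, labelled with its name.
axis(side = 1, at = seq_len(nrow(m)), labels = rownames(m))
```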


You could also use ggplot with geom_point or geom_jitter, e.g.:

library(ggplot2)  # needed for ggplot() and geom_jitter()
ggplot() +
  geom_jitter(mapping = aes(x = rep(1:6, times = 6), y = as.numeric(unlist(data.frame(df1)))))


Note that you can also create a "jitter" effect in base R by adding rnorm() noise to the x values; the last argument of rnorm() (the standard deviation) controls the amount of jittering:

plot(x = rep(1:6, times = 6) + rnorm(36, 0, 0.05), y = unlist(df1), xlab = 'Gene', ylab = 'Value')
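
Base R also provides a `jitter()` function, which adds uniform rather than normal noise. A minimal sketch with made-up data follows. The `amount = 0.1` value is an arbitrary choice that keeps every point within 0.1 of its gene position.

```r
# Hypothetical data: 6 genes (rows) x 6 samples (columns).
set.seed(42)
m <- matrix(rnorm(36), ncol = 6)

# Gene position for each value, in column-major order.
x <- rep(1:6, times = 6)

# jitter() shifts each x by a uniform random offset in [-amount, amount].
x_jit <- jitter(x, amount = 0.1)

plot(x = x_jit, y = as.vector(m), xlab = "Gene", ylab = "Value")
```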


rg255