I think that working through this simple example might help you solve your real problem.
We have to start by preparing a dummy data set (please read how to make a minimal reproducible example):
Make a treatment data set:
library(tidyverse)
set.seed(56154455)
treatment <- data.frame(
  geneName = LETTERS,
  cts = sample(0:1000, 26)
)
head(treatment)
# geneName cts
# 1 A 834
# 2 B 860
# 3 C 950
# 4 D 302
# 5 E 979
# 6 F 159
Make a control data set:
set.seed(56154455)
control <- treatment[sample(1:26, 26), ]
control[, 1] <- treatment[, 1]
head(control)
# geneName cts
# 3 A 950
# 23 B 41
# 15 C 889
# 20 D 629
# 14 E 398
# 4 F 302
Join both treatment and control by geneName:
cts <- full_join(treatment, control, by = 'geneName') %>%
  rename('treatment' = cts.x, 'control' = cts.y) %>%
  column_to_rownames('geneName') %>%
  as.matrix()
head(cts)
# treatment control
# A 834 950
# B 860 41
# C 950 889
# D 302 629
# E 979 398
# F 159 302
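A full join keeps genes that appear in only one of the two tables, so the joined matrix can end up with NA values, while DESeq2 expects non-negative integer counts. The following is a minimal base R sketch of a sanity check you could run before going further. The helper check_counts and the toy matrix are hypothetical, made up for illustration, and are not part of DESeq2 or of the data above.

```r
# Hypothetical helper (not part of DESeq2): basic sanity checks on a count matrix.
check_counts <- function(m) {
  stopifnot(is.matrix(m), is.numeric(m))
  c(
    has_na       = anyNA(m),
    has_negative = any(m < 0, na.rm = TRUE),
    non_integer  = any(m != round(m), na.rm = TRUE)
  )
}

# Toy matrix standing in for cts; gene C is missing from the control sample.
toy <- matrix(
  c(834, 860, 950, 950, 41, NA),
  ncol = 2,
  dimnames = list(c("A", "B", "C"), c("treatment", "control"))
)

check_counts(toy)
```

If any of the three flags comes back TRUE, it is probably worth fixing the input tables before building the DESeqDataSet.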
Prepare your coldata table
Remember, this is just a dummy example, so your real coldata might include any number of columns, reflecting the design of your experiment. However, the number of rows in your coldata has to be equal to the number of columns in your experimental data (here it is cts). Please read the documentation for the SummarizedExperiment class, where you can find a detailed explanation. Another great resource is Rafa's book.
coldata <- matrix(c("DMSO", "1xPBS"), dimnames = list(colnames(cts), 'treatment'))
coldata
# treatment
# treatment "DMSO"
# control "1xPBS"
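The requirement that coldata has one row per column of the count matrix can be checked directly. This is only a sketch in base R: cts_toy and coldata_toy are small stand-ins invented here, and comparing the names as well as the sizes is my assumption about what is safest, since the sample order should be the same in both objects.

```r
# Toy stand-ins for the cts and coldata objects (values are illustrative).
cts_toy <- matrix(
  c(834L, 860L, 950L, 41L),
  ncol = 2,
  dimnames = list(c("A", "B"), c("treatment", "control"))
)
coldata_toy <- matrix(
  c("DMSO", "1xPBS"),
  dimnames = list(colnames(cts_toy), "treatment")
)

# One row of sample metadata per column of counts, in the same order.
same_size  <- nrow(coldata_toy) == ncol(cts_toy)
same_order <- identical(rownames(coldata_toy), colnames(cts_toy))
stopifnot(same_size, same_order)
```

The same two comparisons can be run on your real cts and coldata objects.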
Finally, create your DESeqDataSet:
dds <- DESeq2::DESeqDataSetFromMatrix(
  countData = cts,
  colData = coldata,
  design = ~treatment
)
Where:
countData is your experimental data, prepared as above;
colData is your coldata matrix, with experimental metadata;
~treatment is the formula, describing the experimental model you test in your experiment. It could be anything like ~ treatment + sex * age etc.
dds
# class: DESeqDataSet
# dim: 26 2
# metadata(1): version
# assays(1): counts
# rownames(26): A B ... Y Z
# rowData names(0):
# colnames(2): treatment control
# colData names(1): treatment
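If you are unsure what a longer design formula such as ~ treatment + sex * age actually contains, base R can list its terms without any data. This is just a sketch: sex and age are placeholder variable names taken from the example formula above, not columns of the coldata used here.

```r
# The example formula from the text; sex and age are placeholder names.
design <- ~ treatment + sex * age

# `sex * age` is shorthand for sex + age + sex:age,
# so the interaction shows up as its own term.
design_terms <- attr(terms(design), "term.labels")
design_terms
```

Every variable that appears in the formula has to exist as a column of your coldata.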