I am attempting to perform miRNA correlation using the anamiR package in R (via RStudio). The script I am using is:
library(anamiR)
mrna1 = read.csv("D:\\file1.csv", row.names = 1, header= TRUE)
mrna <- as.matrix(mrna1)
rm(mrna1)
mirna1 = read.csv("D:\\file2.csv", row.names = 1, header= TRUE)
mirna <- as.matrix(mirna1)
rm(mirna1)
pheno.mirna1 = read.csv("D:\\file3.csv", row.names = 1, header= TRUE)
pheno.mirna <- as.matrix(pheno.mirna1)
rm(pheno.mirna1)
pheno.mrna1 = read.csv("D:\\file4.csv", row.names = 1, header= TRUE)
pheno.mrna <- as.matrix(pheno.mrna1)
rm(pheno.mrna1)
mrna_se <- SummarizedExperiment::SummarizedExperiment(
assays = S4Vectors::SimpleList(counts=mrna),
colData = pheno.mrna)
mirna_se <- SummarizedExperiment::SummarizedExperiment(
assays = S4Vectors::SimpleList(counts=mirna),
colData = pheno.mirna)
mrna_d <- differExp_discrete(se = mrna_se,
class = "ER", method = "DESeq",
t_test.var = FALSE, log2 = FALSE,
p_value.cutoff = 0.05, logratio = 0.5
)
mirna_d <- differExp_discrete(se = mirna_se,
class = "ER", method = "DESeq",
t_test.var = FALSE, log2 = FALSE,
p_value.cutoff = 0.05, logratio = 0.5
)
When I reach the following calls (this is the code that generates the error):
mrna_d <- differExp_discrete(se = mrna_se,
class = "ER", method = "DESeq",
t_test.var = FALSE, log2 = FALSE,
p_value.cutoff = 0.05, logratio = 0.5
)
mirna_d <- differExp_discrete(se = mirna_se,
class = "ER", method = "DESeq",
t_test.var = FALSE, log2 = FALSE,
p_value.cutoff = 0.05, logratio = 0.5
)
I get
Error in model.matrix.formula(design(object), colData(object)) :
data must be a data.frame
In addition: Warning message:
In DESeq2::DESeqDataSet(se, design = tmp) :
some variables in design formula are characters, converting to factors
My sessionInfo is:
> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252 LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C LC_TIME=English_Australia.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] anamiR_1.13.0
loaded via a namespace (and not attached):
[1] backports_1.2.1 Hmisc_4.5-0 BiocFileCache_1.10.2 plyr_1.8.6
[5] splines_3.6.0 BiocParallel_1.20.1 AlgDesign_1.2.0 GenomeInfoDb_1.22.1
[9] ggplot2_3.3.3 digest_0.6.27 foreach_1.5.1 htmltools_0.5.1.1
[13] fansi_0.4.2 magrittr_2.0.1 checkmate_2.0.0 memoise_2.0.0
[17] cluster_2.1.1 limma_3.42.2 readr_1.4.0 Biostrings_2.54.0
[21] annotate_1.64.0 matrixStats_0.58.0 askpass_1.1 siggenes_1.60.0
[25] prettyunits_1.1.1 jpeg_0.1-8.1 colorspace_2.0-0 rappdirs_0.3.3
[29] blob_1.2.1 haven_2.3.1 xfun_0.22 dplyr_1.0.5
[33] crayon_1.4.1 RCurl_1.98-1.3 graph_1.64.0 genefilter_1.68.0
[37] GEOquery_2.54.1 survival_3.2-10 iterators_1.0.13 glue_1.4.2
[41] gtable_0.3.0 lumi_2.38.0 zlibbioc_1.32.0 XVector_0.26.0
[45] DelayedArray_0.12.3 questionr_0.7.4 Rhdf5lib_1.8.0 BiocGenerics_0.32.0
[49] HDF5Array_1.14.4 scales_1.1.1 rngtools_1.5 DBI_1.1.1
[53] miniUI_0.1.1.1 Rcpp_1.0.6 progress_1.2.2 xtable_1.8-4
[57] htmlTable_2.1.0 gage_2.36.0 bumphunter_1.28.0 foreign_0.8-71
[61] bit_4.0.4 mclust_5.4.7 preprocessCore_1.48.0 Formula_1.2-4
[65] stats4_3.6.0 htmlwidgets_1.5.3 httr_1.4.2 gplots_3.1.1
[69] RColorBrewer_1.1-2 ellipsis_0.3.1 pkgconfig_2.0.3 reshape_0.8.8
[73] XML_3.99-0.3 dbplyr_2.1.0 nnet_7.3-15 locfit_1.5-9.4
[77] utf8_1.2.1 tidyselect_1.1.0 rlang_0.4.10 later_1.1.0.1
[81] AnnotationDbi_1.48.0 munsell_0.5.0 tools_3.6.0 cachem_1.0.4
[85] generics_0.1.0 RSQLite_2.2.5 stringr_1.4.0 fastmap_1.1.0
[89] knitr_1.31 bit64_4.0.5 beanplot_1.2 caTools_1.18.2
[93] methylumi_2.32.0 scrime_1.3.5 purrr_0.3.4 KEGGREST_1.26.1
[97] doRNG_1.8.2 nlme_3.1-152 mime_0.10 nor1mix_1.3-0
[101] xml2_1.3.2 biomaRt_2.42.1 compiler_3.6.0 rstudioapi_0.13
[105] curl_4.3 png_0.1-7 affyio_1.56.0 klaR_0.6-15
[109] tibble_3.1.0 geneplotter_1.64.0 stringi_1.5.3 highr_0.8
[113] GenomicFeatures_1.38.2 minfi_1.32.0 forcats_0.5.1 lattice_0.20-41
[117] Matrix_1.3-2 multtest_2.42.0 vctrs_0.3.7 pillar_1.5.1
[121] lifecycle_1.0.0 BiocManager_1.30.12 combinat_0.0-8 data.table_1.14.0
[125] bitops_1.0-6 rtracklayer_1.46.0 httpuv_1.5.5 agricolae_1.3-3
[129] GenomicRanges_1.38.0 affy_1.64.0 R6_2.5.0 latticeExtra_0.6-29
[133] RMySQL_0.10.21 promises_1.2.0.1 KernSmooth_2.23-18 gridExtra_2.3
[137] nleqslv_3.3.2 IRanges_2.20.2 codetools_0.2-18 MASS_7.3-53.1
[141] gtools_3.8.2 assertthat_0.2.1 rhdf5_2.30.1 Sum
[145] openssl_1.4.3 DESeq2_1.26.0 GenomicAlignments_1.22.1 Rsamtools_2.2.3
[149] S4Vectors_0.24.4 GenomeInfoDbData_1.2.2 mgcv_1.8-34 parallel_3.6.0
[153] hms_1.0.0 quadprog_1.5-8 grid_3.6.0 rpart_4.1-15
[157] labelled_2.8.0 tidyr_1.1.3 base64_2.0 DelayedMatrixStats_1.8.0
[161] illuminaio_0.28.0 Biobase_2.46.0 shiny_1.6.0 base64enc_0.1-3
Changing R versions does not help. I believe the problem is that neither mrna_se@colData nor mirna_se@colData is a data frame:
> is.data.frame(mirna_se@colData)
[1] FALSE
> is.data.frame(mrna_se@colData)
[1] FALSE
So how can I convert these objects within the overall S4 object to data frames so that DESeq2 can use them to produce differential expression data? This is driving me insane.
Also, before anyone asks:
> packageVersion("DESeq2")
[1] ‘1.26.0’
In response to the comment, I changed the code as shown below and now get the following message and error:
mrna_se <- SummarizedExperiment::SummarizedExperiment(
assays = S4Vectors::SimpleList(counts=mrna),
colData = as.data.frame(pheno.mrna))
it appears that the last variable in the design formula, 'ER',
has a factor level, 'control', which is not the reference level. we recommend
to use factor(...,levels=...) or relevel() to set this as the reference level
before proceeding. for more information, please see the 'Note on factor levels'
in vignette('DESeq2').
Error in model.matrix.formula(design(object), colData(object)) :
data must be a data.frame
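For reference, the releveling that the DESeq2 message suggests would look something like the base-R sketch below. The toy `pheno` table and the level name "case" are made up for illustration; only the column name "ER" and the level "control" come from my output. This only addresses the note about the reference level, and I have no evidence that it affects the "data must be a data.frame" error.

```r
# Toy phenotype table (hypothetical data; "case" is an invented level name).
pheno <- data.frame(ER = c("case", "case", "control", "control"),
                    row.names = c("S1", "S2", "S3", "S4"),
                    stringsAsFactors = FALSE)

# By default, factor levels are sorted alphabetically, so "case" comes first
# and would be treated as the reference level.
pheno$ER <- factor(pheno$ER)
levels(pheno$ER)   # "case" "control"

# Move "control" to the front so that it becomes the reference level.
pheno$ER <- relevel(pheno$ER, ref = "control")
levels(pheno$ER)   # "control" "case"
```

The releveled table would then be passed as colData in place of the original phenotype table.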
Further Edits:
If I read the CSV files in directly, without converting them to matrices, I get:
> library(anamiR)
>
> mrna = read.csv("D:\\file1.csv", row.names = 1, header= TRUE)
> mirna = read.csv("D:\\file2.csv", row.names = 1, header= TRUE)
> pheno.mirna = read.csv("D:\\file3.csv", row.names = 1, header= TRUE)
> pheno.mrna = read.csv("D:\\file4.csv", row.names = 1, header= TRUE)
>
> mrna_se <- SummarizedExperiment::SummarizedExperiment(
+ assays = S4Vectors::SimpleList(counts=mrna),
+ colData = as.data.frame(pheno.mrna))
Error in all_dims[, 1L] : incorrect number of dimensions
>
> mirna_se <- SummarizedExperiment::SummarizedExperiment(
+ assays = S4Vectors::SimpleList(counts=mirna),
+ colData = pheno.mirna)
Error in all_dims[, 1L] : incorrect number of dimensions
>
> mrna_d <- differExp_discrete(se = mrna_se,
+ class = "ER", method = "DESeq",
+ t_test.var = FALSE, log2 = FALSE,
+ p_value.cutoff = 0.05, logratio = 0.5
+ )
Error in SummarizedExperiment::assays(se) : object 'mrna_se' not found
>
> mirna_d <- differExp_discrete(se = mirna_se,
+ class = "ER", method = "DESeq",
+ t_test.var = FALSE, log2 = FALSE,
+ p_value.cutoff = 0.05, logratio = 0.5
+ )
Error in SummarizedExperiment::assays(se) : object 'mirna_se' not found
The traceback for my original error is below (one object is built using the proposed as.data.frame solution and the other using my original matrix loading):
> library(anamiR)
>
> mrna1 = read.csv("D:\\file1.csv.csv", row.names = 1, header= TRUE)
> mrna <- as.matrix(mrna1)
> rm(mrna1)
> mirna1 = read.csv("D:\\file2.csv.csv", row.names = 1, header= TRUE)
> mirna <- as.matrix(mirna1)
> rm(mirna1)
> pheno.mirna1 = read.csv("D:\\file3.csv.csv", row.names = 1, header= TRUE)
> pheno.mirna <- as.matrix(pheno.mirna1)
> rm(pheno.mirna1)
> pheno.mrna1 = read.csv("D:\\file4.csv.csv", row.names = 1, header= TRUE)
> pheno.mrna <- as.matrix(pheno.mrna1)
> rm(pheno.mrna1)
>
> mrna_se <- SummarizedExperiment::SummarizedExperiment(
+ assays = S4Vectors::SimpleList(counts=mrna),
+ colData = as.data.frame(pheno.mrna))
>
> mirna_se <- SummarizedExperiment::SummarizedExperiment(
+ assays = S4Vectors::SimpleList(counts=mirna),
+ colData = pheno.mirna)
>
> mrna_d <- differExp_discrete(se = mrna_se,
+ class = "ER", method = "DESeq",
+ t_test.var = FALSE, log2 = FALSE,
+ p_value.cutoff = 0.05, logratio = 0.5
+ )
it appears that the last variable in the design formula, 'ER',
has a factor level, 'control', which is not the reference level. we recommend
to use factor(...,levels=...) or relevel() to set this as the reference level
before proceeding. for more information, please see the 'Note on factor levels'
in vignette('DESeq2').
Error in model.matrix.formula(design(object), colData(object)) :
data must be a data.frame
>
> traceback()
6: stop("data must be a data.frame")
5: model.matrix.formula(design(object), colData(object))
4: stats::model.matrix(design(object), colData(object))
3: designAndArgChecker(object, betaPrior)
2: DESeq2::DESeq(dds)
1: differExp_discrete(se = mrna_se, class = "ER", method = "DESeq",
t_test.var = FALSE, log2 = FALSE, p_value.cutoff = 0.05,
logratio = 0.5)
>
> mirna_d <- differExp_discrete(se = mirna_se,
+ class = "ER", method = "DESeq",
+ t_test.var = FALSE, log2 = FALSE,
+ p_value.cutoff = 0.05, logratio = 0.5
+ )
Error in model.matrix.formula(design(object), colData(object)) :
data must be a data.frame
In addition: Warning message:
In DESeq2::DESeqDataSet(se, design = tmp) :
some variables in design formula are characters, converting to factors
>
> traceback()
6: stop("data must be a data.frame")
5: model.matrix.formula(design(object), colData(object))
4: stats::model.matrix(design(object), colData(object))
3: designAndArgChecker(object, betaPrior)
2: DESeq2::DESeq(dds)
1: differExp_discrete(se = mirna_se, class = "ER", method = "DESeq",
t_test.var = FALSE, log2 = FALSE, p_value.cutoff = 0.05,
logratio = 0.5)
The traceback for the new error is:
> library(anamiR)
>
> mrna = read.csv("D:\\file1.csv", row.names = 1, header= TRUE)
> mirna = read.csv("D:\\file2.csv", row.names = 1, header= TRUE)
> pheno.mirna = read.csv("D:\\file3.csv", row.names = 1, header= TRUE)
> pheno.mrna = read.csv("D:\\file4.csv", row.names = 1, header= TRUE)
>
> mrna_se <- SummarizedExperiment::SummarizedExperiment(
+ assays = S4Vectors::SimpleList(counts=mrna),
+ colData = as.data.frame(pheno.mrna))
Error in all_dims[, 1L] : incorrect number of dimensions
> traceback()
7: method(object)
6: validityMethod(as(object, superClass))
5: isTRUE(x)
4: anyStrings(validityMethod(as(object, superClass)))
3: validObject(ans)
2: Assays(assays)
1: SummarizedExperiment::SummarizedExperiment(assays = S4Vectors::SimpleList(counts = mrna),
colData = as.data.frame(pheno.mrna))
>
> mirna_se <- SummarizedExperiment::SummarizedExperiment(
+ assays = S4Vectors::SimpleList(counts=mirna),
+ colData = pheno.mirna)
Error in all_dims[, 1L] : incorrect number of dimensions
> traceback()
7: method(object)
6: validityMethod(as(object, superClass))
5: isTRUE(x)
4: anyStrings(validityMethod(as(object, superClass)))
3: validObject(ans)
2: Assays(assays)
1: SummarizedExperiment::SummarizedExperiment(assays = S4Vectors::SimpleList(counts = mirna),
colData = pheno.mirna)
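To make the difference between the two loading styles used above concrete, here is a small base-R check. The inline CSV text is made up and merely stands in for my real files. It only shows what type of object each approach produces, and does not by itself explain either error.

```r
# Made-up inline data standing in for the CSV files (not my real data).
counts_csv <- "gene,S1,S2,S3,S4
g1,10,12,30,28
g2,5,7,6,4"
pheno_csv <- "sample,ER
S1,control
S2,control
S3,case
S4,case"

counts_df <- read.csv(text = counts_csv, row.names = 1, header = TRUE)
pheno_df  <- read.csv(text = pheno_csv,  row.names = 1, header = TRUE)

# read.csv() returns a data.frame, not a matrix.
is.data.frame(counts_df)   # TRUE
is.matrix(counts_df)       # FALSE

# An all-numeric table converts cleanly to a numeric matrix.
counts_mat <- as.matrix(counts_df)
is.numeric(counts_mat)     # TRUE

# A table with a text column becomes a character matrix,
# which is no longer a data.frame.
pheno_mat <- as.matrix(pheno_df)
is.character(pheno_mat)    # TRUE
is.data.frame(pheno_mat)   # FALSE
```

So the count tables end up as numeric matrices after as.matrix(), while the phenotype tables end up as character matrices when converted the same way.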
Latest Edits (12/4/21)
So, in response to the comments, I am now loading my data files as below (count tables as matrices, phenotype tables as data frames):
mrna <- as.matrix(read.csv("D:\\CorrelationDataProcessing\\TRAMP30w\\mrnaTRAMP_Mut30w_v_WT30w_normcounts.csv", row.names = 1, header= TRUE))
mirna <- as.matrix(read.csv("D:\\CorrelationDataProcessing\\TRAMP30w\\mirnaTRAMP_Mut30w_vs_WT30w_normcounts.csv", row.names = 1, header= TRUE))
pheno.mirna = read.csv("D:\\CorrelationDataProcessing\\TRAMP30w\\mirnapheno.csv", row.names = 1, header= TRUE)
pheno.mrna = read.csv("D:\\CorrelationDataProcessing\\TRAMP30w\\mrnapheno.csv", row.names = 1, header= TRUE)
This results in:
> mrna_d <- differExp_discrete(se = mrna_se,
+ class = "ER", method = "DESeq",
+ t_test.var = FALSE, log2 = FALSE,
+ p_value.cutoff = 0.05, logratio = 0.5
+ )
it appears that the last variable in the design formula, 'ER',
has a factor level, 'control', which is not the reference level. we recommend
to use factor(...,levels=...) or relevel() to set this as the reference level
before proceeding. for more information, please see the 'Note on factor levels'
in vignette('DESeq2').
Error in model.matrix.formula(design(object), colData(object)) :
data must be a data.frame
>
> mirna_d <- differExp_discrete(se = mirna_se,
+ class = "ER", method = "DESeq",
+ t_test.var = FALSE, log2 = FALSE,
+ p_value.cutoff = 0.05, logratio = 0.5
+ )
it appears that the last variable in the design formula, 'ER',
has a factor level, 'control', which is not the reference level. we recommend
to use factor(...,levels=...) or relevel() to set this as the reference level
before proceeding. for more information, please see the 'Note on factor levels'
in vignette('DESeq2').
Error in model.matrix.formula(design(object), colData(object)) :
data must be a data.frame