I was following the workflow described here https://f1000research.com/articles/5-1492/v2 using the sample data as well as my own data. This worked fine, but now I can't generate an OTU table, that contains a header such as "OTU00004" or even better "kingdom_phylum_..._Pseudomonas_OTU00004". I would like to use such a table to find and plot the abundance of a certain OTU over several samples.
I created an object called ps, that seems to be ok:
ps <- phyloseq(tax_table(taxtab), sample_data(samdf),
otu_table(seqtab, taxa_are_rows = FALSE),phy_tree(fitGTR$tree))
> ps
phyloseq-class experiment-level object
otu_table() OTU Table: [ 454 taxa and 360 samples ]
sample_data() Sample Data: [ 360 samples by 14 sample variables ]
tax_table() Taxonomy Table: [ 454 taxa by 6 taxonomic ranks ]
phy_tree() Phylogenetic Tree: [ 454 tips and 452 internal nodes ]
but the headers in the OTU table and the corresponding rows in the taxonomy table are actual (here shortened) sequences
> head(otu_table(ps)[1])
GCAAGCGTTACTCGGAATCACTGGGCGTAAAGAGCGCGTAGGCGG#shortened
F3D0 0
> head(tax_table(ps)[1])
Taxonomy Table: [1 taxa by 6 taxonomic ranks]:
Kingdom
GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGCAGGCGGA#shortened "Bacteria"
Is there a way to combine the information from otu table and taxonomy table and replace the sequences with numbered OTU ids? I have checked several phyloseq resources and FAQs, but I can't find an answer to this.
I would like to have a table looking like this:
taxonomy_OTU00001 taxonomy_OTU00002 taxonomy_OTU00003
F3D0 #counts #counts #counts
F3D1 #counts #counts #counts
F3D11 #counts #counts #counts
F3D125 #counts #counts #counts
As the workflow until this step is quite time consuming, I'm not sure how to provide a reproducible example for this problem.
EDIT: I generated a sample subset following dww's suggestion.
short_otu2 = short_otu = head(otu_table(ps)[,c(1:6)]) # seq as colnames
short_tax2 = short_tax = tax_table(ps)[colnames(short_otu), ] # seq as rownames
# shorten seqs, must still be unique
colnames(short_otu2) <- substr(colnames(short_otu), 0, 50)
rownames(short_tax2) <- substr(rownames(short_tax), 0, 50)
library(phyloseq)
> dput(short_otu2)
new("otu_table", .Data = structure(c(526L, 375L, 2931L, 994L,
2061L, 419L, 319L, 330L, 1737L, 623L, 1868L, 350L, 402L, 207L,
1880L, 577L, 887L, 303L, 413L, 64L, 838L, 698L, 939L, 484L, 146L,
126L, 496L, 440L, 1183L, 184L, 462L, 37L, 26L, 782L, 271L, 310L
), .Dim = c(6L, 6L), .Dimnames = list(c("F3D0", "F3D1", "F3D11",
"F3D125", "F3D13", "F3D141"), c("GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGCAGGCGGAAGAT",
"GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGCAGGCGGACTCT", "GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGTAGGCGGGCTGT",
"GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGTAGGCGGGCTTT", "CCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGGTGGATTGT",
"GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGTAGGCGGCCTGC"))), taxa_are_rows = FALSE)
> dput(short_tax2)
new("taxonomyTable", .Data = structure(c("Bacteria", "Bacteria",
"Bacteria", "Bacteria", "Bacteria", "Bacteria", "Bacteroidetes",
"Bacteroidetes", "Bacteroidetes", "Bacteroidetes", "Bacteroidetes",
"Bacteroidetes", "Bacteroidia", "Bacteroidia", "Bacteroidia",
"Bacteroidia", "Bacteroidia", "Bacteroidia", "Bacteroidales",
"Bacteroidales", "Bacteroidales", "Bacteroidales", "Bacteroidales",
"Bacteroidales", "Bacteroidales_S24-7_group", "Bacteroidales_S24-7_group",
"Bacteroidales_S24-7_group", "Bacteroidales_S24-7_group", "Bacteroidaceae",
"Bacteroidales_S24-7_group", NA, NA, NA, NA, "Bacteroides", NA
), .Dim = c(6L, 6L), .Dimnames = list(c("GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGCAGGCGGAAGAT",
"GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGCAGGCGGACTCT", "GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGTAGGCGGGCTGT",
"GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGTAGGCGGGCTTT", "CCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGGTGGATTGT",
"GCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGTAGGCGGCCTGC"), c("Kingdom",
"Phylum", "Class", "Order", "Family", "Genus"))))