Looking for code in R to summarize by ____H or ____D?

Question

I have a table of ASVs per sample. Each sample is labeled with a number (the sample) and a letter that corresponds to human or dog. I am trying to see which ASVs are found only in humans or only in dogs. My idea is to sum the counts for each ASV across all human samples and across all dog samples, ignoring individual samples, and then check whether each sum is zero or greater than zero.

I am unsure of the code; I have tried a few things but none have worked. I am mainly working with phyloseq and DESeq2. This is the table I'm working with, 11,000 ASV samples.

Welcome to SO! It will be easier to get good help on this forum if you can review the guidelines for how to ask a good R question, especially the link on making a reproducible example: https://stackoverflow.com/tags/r/info. — Jon Spring, Aug 30 '22 at 17:20
For instance, it will be much more useful if you can include some sample data in your question as code and not as a picture, which we can't use without retyping. It would also help if you can include any code you've started with that isn't doing what you want. — Jon Spring, Aug 30 '22 at 17:21
It's easier to help you if you provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Please [do not post code or data in images](https://meta.stackoverflow.com/q/285551/2372064) — MrFlick, Aug 30 '22 at 17:39
Greetings! Typically it is recommended to provide a minimally reproducible dataset with your question. One way of achieving this is by using the `dput` command. You can check out how to do this at this video: https://youtu.be/3EID3P1oisg — Shawn Hemelstrand, Aug 30 '22 at 22:56

score 0 · Answer 1 · answered Aug 30 '22 at 17:57

I'm a little confused about what the row names and column names represent, but I gave it a go. Correct me if this is not exactly what you meant.

The data.table package has a neat function, `melt()`, that transforms data from wide to long format. This will make it easier for you to analyze and sum your values.

```r
library(data.table)

# Toy data: one row per ASV, one column per sample (number + H or D)
data <- data.table(
  `ASV_ID` = c(3,5,6,7,10,11,12,14,15,16,20),
  `2104H` = c(0,353,483,305,289,200,0,0,0,284,406),
  `2104D` = c(470,39,43,427,48,488,356,390,482,0,0),
  `2105H` = c(0,784,816,0,704,100,0,0,0,158,141),
  `2105D` = c(0,0,0,0,0,0,0,0,0,0,0))

data
```

```
    ASV_ID 2104H 2104D 2105H 2105D
 1:      3     0   470     0     0
 2:      5   353    39   784     0
 3:      6   483    43   816     0
 4:      7   305   427     0     0
 5:     10   289    48   704     0
 6:     11   200   488   100     0
 7:     12     0   356     0     0
 8:     14     0   390     0     0
 9:     15     0   482     0     0
10:     16   284     0   158     0
11:     20   406     0   141     0
```

```r
# Wide to long: one row per ASV/sample pair
data2 <- melt(
  data = data,
  id.vars = c("ASV_ID"),
  measure.vars = c("2104H","2104D","2105H","2105D"),
  variable.name = "sample",
  value.name = "value")

# Total count per sample
data2[,.(Sum = sum(value)),by=.(sample)]
```

```
   sample  Sum
1:  2104H 2320
2:  2104D 2743
3:  2105H 2703
4:  2105D    0
```
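
The totals above are per sample. If the goal is to find ASVs that occur in only one host, one possible extension is to sum per ASV and per host instead. This is a minimal sketch on the same toy data, not the real 11,000-row table, and it assumes the host letter (H or D) is always the last character of the sample name. The names `long`, `host`, `by_host`, `wide`, `human_only`, and `dog_only` are made up for illustration.

```r
library(data.table)

# Same toy data as above (illustrative values, not the asker's real table)
data <- data.table(
  `ASV_ID` = c(3,5,6,7,10,11,12,14,15,16,20),
  `2104H` = c(0,353,483,305,289,200,0,0,0,284,406),
  `2104D` = c(470,39,43,427,48,488,356,390,482,0,0),
  `2105H` = c(0,784,816,0,704,100,0,0,0,158,141),
  `2105D` = c(0,0,0,0,0,0,0,0,0,0,0))

# Wide to long; every column except ASV_ID is treated as a sample
long <- melt(data, id.vars = "ASV_ID",
             variable.name = "sample", value.name = "value")

# Assumption: the last character of the sample name is the host (H or D)
long[, host := sub(".*([HD])$", "\\1", as.character(sample))]

# Sum counts per ASV and host, ignoring individual samples
by_host <- long[, .(total = sum(value)), by = .(ASV_ID, host)]

# Back to wide: one row per ASV, one column per host (D and H)
wide <- dcast(by_host, ASV_ID ~ host, value.var = "total")

# ASVs with counts in one host and none in the other
human_only <- wide[H > 0 & D == 0, ASV_ID]
dog_only   <- wide[D > 0 & H == 0, ASV_ID]

human_only  # 16 20
dog_only    # 3 12 14 15
```

On a real table the same steps should apply as long as the sample columns follow the number-plus-letter pattern. If the counts live in a phyloseq object, they would first need to be pulled out into a plain table, which this sketch does not cover.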
