Using MongoDB, I would like to run a find and a count in a single query and print the result. I previously found this thread, but it doesn't answer my question: when I substituted my own fields, the query still failed because of the memory limit. This is what my documents look like:

{ "_id" : ObjectId("5ca47bca0953f323b39019b2"), "Sample" : "test-exome-1_hg38", "Chromosome" : "chr1", "Position" : 69511, "Reference" : "A", "Mutation" : "G", "ReadDepth" : 206 }
{ "_id" : ObjectId("5ca47bca0953f323b39019cd"), "Sample" : "test-exome-1_hg38", "Chromosome" : "chr1", "Position" : 942451, "Reference" : "T", "Mutation" : "C", "ReadDepth" : 65 }
{ "_id" : ObjectId("5ca47bca0953f323b39019d5"), "Sample" : "test-exome-1_hg38", "Chromosome" : "chr1", "Position" : 946247, "Reference" : "G", "Mutation" : "A", "ReadDepth" : 114 }
{ "_id" : ObjectId("5ca47bca0953f323b39019d3"), "Sample" : "test-exome-1_hg38", "Chromosome" : "chr1", "Position" : 952421, "Reference" : "A", "Mutation" : "G", "ReadDepth" : 258 }
{ "_id" : ObjectId("5ca47bca0953f323b39019d4"), "Sample" : "test-exome-1_hg38", "Chromosome" : "chr1", "Position" : 953259, "Reference" : "T", "Mutation" : "C", "ReadDepth" : 161 }
{ "_id" : ObjectId("5ca47bca0953f323b39019d8"), "Sample" : "test-exome-1_hg38", "Chromosome" : "chr1", "Position" : 953279, "Reference" : "T", "Mutation" : "C", "ReadDepth" : 155 }
{ "_id" : ObjectId("5ca47bca0953f323b39019db"), "Sample" : "test-exome-1_hg38", "Chromosome" : "chr1", "Position" : 961945, "Reference" : "G", "Mutation" : "C", "ReadDepth" : 205 }

I could run the find and the count as two separate queries:

db.test_mutindiv.find({"Sample": "test-exome-1_hg38", "Chromosome": "chr1", "Position": 69511, "Reference": "A", "Mutation": "G"})
db.test_mutindiv.find({"Sample": "test-exome-1_hg38", "Chromosome": "chr1", "Position": 69511, "Reference": "A", "Mutation": "G"}).count()

I tried the following:

db.test_mutindiv.aggregate(
    [
        { "$project": { 
            "Sample": "test-exome-1_hg38",
            "Chromosome":"chr1",
            "Position": 17512,
            "Reference": "C",
            "Mutation": "G",
            "count": { "$sum": 1 }
        }},
    ]
)

and

db.test_mutindiv.aggregate(
    [
        { "$group": {
            "_id": null, 
            "docs": { "$push": "$$ROOT" }, 
            "count": { "$sum": 1 }
        }},
        { "$project": { "_id": 0, "count": 1, "docs": { "$slice": [ "$docs", 5 ] } }}
    ]
)

but neither of them worked. Ultimately, I would like to obtain output in the following format:

test-exome-1_hg38,chr1,69511,A,G,2
user324810

1 Answer

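You need $match to apply your filter before counting or returning the matching documents. Without it, your $group stage pushes every document in the collection into a single array, which is most likely why it ran into the memory limit. Once the data is filtered, you can use $facet, which runs several sub-pipelines over the same filtered set within a single stage: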

db.test_mutindiv.aggregate([
    {
        $match: {"Sample": "test-exome-1_hg38", "Chromosome": "chr1", "Position": 69511, "Reference": "A", "Mutation": "G"}
    },
    {
        $facet: {
            count: [ { $count: "total" } ],
            docs: [ { $match: {} } ]
        }
    },
    {
        $unwind: "$count"
    }
])
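
To get from that result to the comma-separated line asked for in the question, something along these lines might work. This is only a sketch: `formatFacetResult` is a hypothetical helper name (not a MongoDB function), and `sampleResult` is a hand-written object shaped like the pipeline's output, with an illustrative second document, not real query output.

```javascript
// Hypothetical helper: turns one result document of the
// $match + $facet + $unwind pipeline into a CSV line.
function formatFacetResult(result) {
    // When nothing matches, $unwind on the empty "count" array drops the
    // only result document, so the caller may pass undefined here.
    if (!result || result.docs.length === 0) {
        return null;
    }
    // All matched documents share the filtered fields, so the first one is enough.
    const first = result.docs[0];
    return [
        first.Sample,
        first.Chromosome,
        first.Position,
        first.Reference,
        first.Mutation,
        result.count.total
    ].join(",");
}

// Assumed shape of the pipeline output (values are made up for illustration).
const sampleResult = {
    count: { total: 2 },
    docs: [
        { Sample: "test-exome-1_hg38", Chromosome: "chr1", Position: 69511, Reference: "A", Mutation: "G", ReadDepth: 206 },
        { Sample: "test-exome-1_hg38", Chromosome: "chr1", Position: 69511, Reference: "A", Mutation: "G", ReadDepth: 198 }
    ]
};

console.log(formatFacetResult(sampleResult));
// prints: test-exome-1_hg38,chr1,69511,A,G,2
```

In the shell, the first element of the aggregation's `.toArray()` result could be passed to the helper in place of `sampleResult`. A `null` return then means no document matched the filter.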
mickl