I have a file of SNPs for two mouse strains (B6 and CAST) in this format:
Chr Position B6.SNP CAST.SNP B6.Count CAST.Count
chr17 1000 A G 102 82
chr17 2000 A G 91 76
chr17 5000 C T 55 87
chr17 7500 G A 70 36
chr17 10200 A G 83 45
chr17 17000 C T 34 91
chr17 20000 G A 95 46
I would like to group the SNPs (i.e rows) by 'Chr' and 'Position' in 10,000 bp bins along the chromosome; in other words, group all SNPs that fall within 0-10,000bp, then 10,0001-20,000bp, and so forth). In addition, for all the SNPs that fall within each bin, I want to sum their B6 and CAST counts, as well as create a new column that includes the number of SNPs that fell within each bin.
So perhaps an output file (using the example above):
Chr Start End SNP.Count B6.Count.Sum CAST.Count.Sum
chr17 0 10000 4 318 281
chr17 10001 20000 3 212 182
Thanks in advance.