I have a unique problem that I'm having trouble solving in a way that doesn't take a ridiculous amount of time. I am using MongoDB and pulling collections that can contain anywhere from 20,000 to more than 100,000 documents.

What I need to do is find out which fields in each collection have 20 or fewer DISTINCT values. For instance, a FirstName field is going to get weeded out, because it will have many different values. However, a field like State might only have 10 different values, so I want to keep that one.

I can get the logic to work by simply looping through everything, but as you can imagine, that takes forever and I need to find a better way. Is there something in Mongo aggregation that can help me solve this problem? If so, can anybody point me in the right direction?
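
To make the requirement concrete, here is a minimal sketch of the filtering logic in plain Python over in-memory dictionaries, not the C# driver and not the code from the question. The function name `low_cardinality_fields`, the sample documents, and the early-exit cap are all illustrative assumptions. It makes one pass over the documents and stops tracking a field as soon as it exceeds the limit, so a high-cardinality field such as FirstName never holds more than 21 values.

```python
from typing import Any, Dict, Iterable, List, Set


def low_cardinality_fields(docs: Iterable[Dict[str, Any]], limit: int = 20) -> List[str]:
    """Return the fields that have at most `limit` distinct values (hypothetical helper)."""
    seen: Dict[str, Set[Any]] = {}  # field -> distinct values seen so far
    rejected: Set[str] = set()      # fields already known to exceed the limit

    for doc in docs:
        for field, value in doc.items():
            if field in rejected:
                continue  # no need to track this field any further
            values = seen.setdefault(field, set())
            values.add(value)  # assumes values are hashable scalars
            if len(values) > limit:
                rejected.add(field)
                del seen[field]  # release the values held for this field

    return sorted(seen)


# Made-up sample data: 1000 distinct first names, 10 distinct states.
docs = [{"FirstName": f"name{i}", "State": f"S{i % 10}"} for i in range(1000)]
result = low_cardinality_fields(docs)
print(result)  # ['State']
```

This only illustrates the per-field cap; it still reads every document once, so it does not answer whether the work can be pushed to the server.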

EDIT: I know this got marked as a duplicate, but I don't think the other answer really pertains to my question. That answer is strictly about working with Mongo directly, whereas I'm using the C# driver.

halterdev
  • Edited. I don't think this is a duplicate. – halterdev Jan 09 '17 at 19:51
  • You haven't really explained why you feel that using the C# driver means the duplicate is invalid. The driver is just an interface to the same database system, after all. Can't you still use the `distinct` command? – Simon MᶜKenzie Jan 10 '17 at 02:11
