Seems like the obvious way is to get all header data somehow (see further below), and then iterate through the collection and if you were to write by hand (which people don't encourage), string build, writing to file in batches (if your collection were quite large).
HashSet<string> fields = new HashSet<string>();
BsonDocument query = BsonDocument.Parse(filter);
var result = database.GetCollection<BsonDocument>(collection).Find(new BsonDocument());
// Populate fields with all unique fields, see below for examples how.
var csv = new StringBuilder();
string headerLine = string.Join(",", fields);
csv.AppendLine(headerLine);
foreach (var element in result.ToListAsync().Result)
{
string line = null;
foreach (var field in fields)
{
BsonValue value;
if (field.Contains("."))
{
value = GetNestedField(element, field);
}
else
{
value = element.GetElement(field).Value;
}
// Example deserialize to string
switch (value.BsonType)
{
case BsonType.ObjectId:
line = line + value.ToString();
break;
case BsonType.String:
line = line + value.ToString();
break;
case BsonType.Int32:
line = line + value.AsInt32.ToString();
break;
}
line = line + ",";
}
csv.AppendLine(line);
}
File.WriteAllText("D:\\temp.csv", csv.ToString());
In the case of your own objects you'd have to use your own deserializer.
HOWEVER I'd recommend using the mongoexport tool if you can.
You could simply run the exe from your application, feeding in arguments as required. Keep in mind though, that it requires explicit fields.
ProcessStartInfo startInfo = new ProcessStartInfo();
startInfo.FileName = "C:\mongodb\bin\mongoexport.exe";
startInfo.Arguments = "-d testDB -c testCollection --type csv --fields name,address.street,address.zipCode --out .\output.csv";
startInfo.UseShellExecute = false;
Process exportProcess= new Process();
exportProcess.StartInfo = startInfo;
exportProcess.Start();
exportProcess.WaitForExit();
More on mongoexport such as paging, additional queries and field file:
https://docs.mongodb.com/manual/reference/program/mongoexport/
Getting Unique Field Names
In order to find ALL field names you could do this a number of ways. Using BsonDocument as a generic data example.
- Recursively traverse through your IMongoCollection results. This is going to have to be through the entire collection, so performance may not be great.
Example:
HashSet<string> fields = new HashSet<string>();
var result = database.GetCollection<BsonDocument>(collection).Find(new BsonDocument());
var result = database.GetCollection<BsonDocument>(collection).Find(new BsonDocument());
foreach (var element in result.ToListAsync().Result)
{
ProcessTree(fields, element, "");
}
private void ProcessTree(HashSet<string> fields, BsonDocument tree, string parentField)
{
foreach (var field in tree)
{
string fieldName = field.Name;
if (parentField != "")
{
fieldName = parentField + "." + fieldName;
}
if (field.Value.IsBsonDocument)
{
ProcessTree(fields, field.Value.ToBsonDocument(), fieldName);
}
else
{
fields.Add(fieldName);
}
}
}
- Perform a MapReduce operation to return all fields. Scanning nested fields becomes more complex with this method however. See this.
Example:
string map = @"function() {
for (var key in this) { emit(key, null); }
}";
string reduce = @"function(key, stuff) { return null; }";
string finalize = @"function(key, value){
return key;
}";
MapReduceOptions<BsonDocument, BsonValue> options = new MapReduceOptions<BsonDocument, BsonValue>();
options.Finalize = new BsonJavaScript(finalize);
var results = database.GetCollection<BsonDocument>(collection).MapReduceAsync(
new BsonJavaScript(map),
new BsonJavaScript(reduce),
options).Result.ToListAsync().Result;
foreach (BsonValue result in results.Select(item => item["_id"]))
{
Debug.WriteLine(result.AsString);
}
- Perform an Aggregation operation. You'd need to unwind as many times as required to get all nested fields.
Example:
string[] pipeline = new string[3];
pipeline[0] = "{ '$project':{ 'arrayofkeyvalue':{ '$objectToArray':'$$ROOT'}}}";
pipeline[1] = "{ '$unwind':'$arrayofkeyvalue'}";
pipeline[2] = "{ '$group':{'_id':null,'fieldKeys':{'$addToSet':'$arrayofkeyvalue.k'}}}";
var stages = pipeline.Select(s => BsonDocument.Parse(s)).ToList();
var result = await database.GetCollection<BsonDocument>(collection).AggregateAsync<BsonDocument>(stages);
foreach (BsonValue fieldName in result.Single().GetElement("fieldKeys").Value.AsBsonArray)
{
Debug.WriteLine(fieldName.AsString);
}
Nothing perfect here and I couldn't tell you which is the most efficient but hopefully something to help.