I have code that retrieves 50k records from Elasticsearch; the documents are JSON with nested properties, for example:
{
    "name": "John",
    "address": {
        "addressLine1": "London, Baker st. 221b",
        "postcode": "1234"
    }
}
Then I need to flatten each record so that it looks like this:
{
    "name": "John",
    "address.addressLine1": "London, Baker st. 221b",
    "address.postcode": "1234"
}
So I use this code to do it:
// Documents returned from ES are mapped to plain object because the data has no fixed
// structure, so we cannot build a strongly typed model.
List<object> documents = GetDocumentsFromES();

var transformedResult = documents.Select(doc =>
{
    var jObject = JObject.FromObject(doc);

    // Some nodes need to remain as-is (no flattening), so we remove them from the
    // original jObject and attach them to the transformed one later.
    var skipDictionary = new Dictionary<string, JToken>();
    foreach (var skipField in fieldsToSkipTransformation)
    {
        if (jObject.ContainsKey(skipField))
        {
            var item = jObject.SelectToken(skipField);
            skipDictionary.Add(skipField, item);
            jObject.Remove(skipField);
        }
    }

    // Take every leaf token and build a "path -> value" dictionary.
    var dict = jObject.Descendants()
        .Where(j => !j.Children().Any())
        .ToDictionary(k => k.Path, v => v.ToString()); // OutOfMemoryException is thrown here

    var transformedObject = JObject.FromObject(dict);

    // Re-attach the fields that were skipped above.
    foreach (var skipFieldKey in skipDictionary.Keys)
    {
        transformedObject.Add(skipFieldKey, skipDictionary[skipFieldKey]);
    }

    return transformedObject;
});
I don't understand why an OutOfMemoryException happens when we try to create the dictionary from this JObject. As I understand it, an OutOfMemoryException is thrown when a single object exceeds the limits of the allocated memory, but I assume that since the dictionary is created inside the scope of the delegate, its memory should be reclaimed once execution leaves the scope of that iteration. None of the objects in ES is big enough on its own to be a problem.
My first thought was about immutable strings, since the code produces quite a lot of them, but again, because this all happens within the scope of a single iteration, I don't understand why memory runs out.
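To make my assumption concrete, here is a stripped-down sketch of how I think the per-iteration allocation behaves (FlattenSketch and TransformLazily are illustrative names only, not my real code):

using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class FlattenSketch
{
    public static IEnumerable<JObject> TransformLazily(IEnumerable<object> documents)
    {
        // Select is deferred: each per-document JObject and its temporary dictionary
        // become eligible for collection once the consumer moves to the next element,
        // provided nothing else holds a reference to them.
        return documents.Select(doc =>
        {
            var jObject = JObject.FromObject(doc);
            var dict = jObject.Descendants()
                .Where(t => !t.Children().Any())
                .ToDictionary(t => t.Path, t => t.ToString());
            return JObject.FromObject(dict);
        });
    }
}

// Note: materializing the whole sequence keeps all 50k flattened objects reachable at once,
// so the per-iteration scope would no longer bound memory:
// var all = FlattenSketch.TransformLazily(documents).ToList();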