
I have code that retrieves 50k records from Elasticsearch. The documents are JSON with nested properties:

{
   "name":"John",
   "address": {
       "addressLine1":"London, Baker st. 221b",
       "postcode":"1234"
   }
}

Then I need to flatten each record to be like this:

{
   "name":"John",
   "address.addressLine1":"London, Baker st. 221b",
   "address.postcode":"1234"       
}

So I use this code to do it:

// Documents returned from ES are mapped to object because the data has no fixed structure, so we cannot build a strongly typed model
List<object> documents = GetDocumentsFromES();
var transformedResult = documents.Select(doc =>
{
    var jObject = JObject.FromObject(doc);
    // Some nodes must stay as they are (not flattened), so we remove them from the original jObject and attach them to the transformed one later
    var skipDictionary = new Dictionary<string, JToken>();

    foreach (var skipField in fieldsToSkipTransformation)
    {
        if (jObject.ContainsKey(skipField))
        {
            var item = jObject.SelectToken(skipField);
            skipDictionary.Add(skipField, item);
            jObject.Remove(skipField);
        }
    }


    var dict = jObject.Descendants()
        .Where(j => !j.Children().Any())
        .ToDictionary(k => k.Path, v => v.ToString()); // Exception thrown here
        

    var transformedObject = JObject.FromObject(dict);

    foreach (var skipFieldKey in skipDictionary.Keys)
    {
        transformedObject.Add(skipFieldKey, skipDictionary[skipFieldKey]);
    }
        
    return transformedObject;
});
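
For reference, here is a self-contained sketch of the same per-document transformation that can be run in isolation when profiling. It assumes Newtonsoft.Json (Json.NET); the `JsonFlattener` class, the `Flatten` method and the `meta` skip field are hypothetical names used only for illustration, and the sketch reproduces the flattening step only, not the retrieval from ES.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class JsonFlattener
{
    // Flattens one document into path-keyed string values.
    // Top-level fields named in fieldsToSkip are carried over unchanged.
    public static JObject Flatten(JObject source, IEnumerable<string> fieldsToSkip)
    {
        // Work on a copy so the caller's document is not modified.
        var working = (JObject)source.DeepClone();
        var skipped = new Dictionary<string, JToken>();

        foreach (var field in fieldsToSkip)
        {
            if (working.TryGetValue(field, out var value))
            {
                skipped[field] = value;
                working.Remove(field);
            }
        }

        var flat = new JObject();

        // Leaf tokens are the descendants without children; Path gives "address.postcode" etc.
        foreach (var leaf in working.Descendants().Where(t => !t.Children().Any()))
        {
            flat[leaf.Path] = leaf.ToString();
        }

        // Re-attach the skipped nodes as they were.
        foreach (var pair in skipped)
        {
            flat[pair.Key] = pair.Value;
        }

        return flat;
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample document from the question plus a hypothetical "meta" node that is skipped.
        var doc = JObject.Parse(
            @"{ ""name"": ""John"",
                ""address"": { ""addressLine1"": ""London, Baker st. 221b"", ""postcode"": ""1234"" },
                ""meta"": { ""source"": ""es"" } }");

        var flat = JsonFlattener.Flatten(doc, new[] { "meta" });
        Console.WriteLine(flat.ToString());
    }
}
```

Writing straight into a `JObject` avoids the intermediate dictionary; whether that changes the memory behaviour in the failing environment is untested here.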

I don't understand why an OutOfMemoryException is thrown when we try to create a dictionary from this JObject. As I understand it, OutOfMemoryException happens when an object exceeds the limits of allocated memory, but since the dictionary is created within the scope of the delegate, I assumed the allocated memory would be released once execution leaves the scope of one iteration. None of the objects in ES is big enough to be a problem.

My first thought was immutable strings, since the code produces quite a lot of them, but again, as it all happens within the scope of one iteration, I don't understand why there is not enough memory.

StNickolas
  • Could you please fix the second JSON because it is not a valid one? – Peter Csala Aug 11 '21 at 09:50
  • Please check [this solution](https://stackoverflow.com/a/57900473/13268855) – Peter Csala Aug 11 '21 at 11:21
  • @PeterCsala There is no problem with flattening the JSON. The problem is memory consumption in a two-dimensional loop – StNickolas Aug 12 '21 at 13:25
  • Have you done any kind of memory profiling? That would help us better understand why you run out of memory. As a good starting point, I would suggest profiling your app with the [Visual Studio built-in memory profiler](https://learn.microsoft.com/en-us/visualstudio/profiling/memory-usage?view=vs-2019) – Peter Csala Aug 12 '21 at 13:45
  • @PeterCsala The problem is that it works fine on my local machine, but it fails when we deploy to the test environment inside Docker. I will try to profile it locally, though I have never done that before, and I use Rider as my IDE – StNickolas Aug 13 '21 at 06:40
  • You can also use JetBrains' [dotTrace](https://www.jetbrains.com/profiler/) or [CodeTrack](https://www.getcodetrack.com/) for this purpose. – Peter Csala Aug 13 '21 at 06:59

0 Answers