
There are probably five or six SO posts that touch on this tangentially, but none really answers the question.

I have a Dictionary object that I use as a cache to store values. The problem is that I don't know how big it is getting: it may or may not be growing large over time, but I cannot tell, so I can't gauge its effectiveness or draw conclusions about how a user is using the software. Because this piece will go into production and monitor something over a very long period of time, it doesn't make sense to attach a memory profiler or any other debugging tool.

Ideally, I would simply place a call in my Timer that would do something like:

private void someTimer_Tick(object sender, EventArgs e)
{
   ...
   float mbMem = cacheMap.GetMemorySize();
   RecordInLog(DateTime.Now.ToString() + ": cacheMap is using " + mbMem.ToString() + "MB of memory");
   ...
}

Can this be done without attaching a debugging tool, so that it can be used in a deployed scenario?

kmarks2
  • Perhaps you have already seen this, but just in case: you can approximate the size by serialization http://stackoverflow.com/questions/605621/how-to-get-object-size-in-memory – oleksii Dec 05 '11 at 15:38
  • Is counting bytes necessary? Could simply counting the number of keys or unique values be good enough? – Sean U Dec 05 '11 at 16:17
  • @oleksii This might be the only solution. I'm still trying to find a way around the substantial performance hit you take when you serialize things (it's a huge hit). – kmarks2 Dec 05 '11 at 19:14
  • If this is being used in production, I'd consider using performance counters to track size (or at least the count of items). – Roger Lipscombe Dec 05 '11 at 19:51
  • Sounds like premature optimization. – John Saunders Dec 05 '11 at 20:27

2 Answers


There is no method in the framework for telling you how much memory an object is using, because doing that for any kind of object would be very complicated.

If you know that your dictionary is easy to profile, i.e. it doesn't contain duplicate references to the same item, and the memory usage of each item can easily be approximated, you can just loop through the dictionary and sum up the approximate size of each object.
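A minimal sketch of that loop, assuming a `Dictionary<string, string>` whose entries never share string instances. The `CacheSizeEstimator` name and the 20-byte per-string overhead are assumptions for illustration; the real overhead depends on the runtime and on whether the process is 32-bit or 64-bit, and the dictionary's own internal arrays are not counted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the framework.
public static class CacheSizeEstimator
{
    // Assumed per-string overhead in bytes; adjust for your runtime and bitness.
    private const int AssumedStringOverheadBytes = 20;

    // Rough estimate for one string: 2 bytes per UTF-16 char plus the assumed overhead.
    public static long EstimateStringBytes(string s)
    {
        return s == null ? 0 : (long)s.Length * 2 + AssumedStringOverheadBytes;
    }

    // Visits every entry once, so the cost grows linearly with the entry count.
    // Counts only the key and value strings, not the dictionary's own storage.
    public static long EstimateBytes(IDictionary<string, string> cache)
    {
        long total = 0;
        foreach (KeyValuePair<string, string> pair in cache)
        {
            total += EstimateStringBytes(pair.Key);
            total += EstimateStringBytes(pair.Value);
        }
        return total;
    }
}
```

In the timer from the question, this could be called as `CacheSizeEstimator.EstimateBytes(cacheMap) / (1024.0 * 1024.0)` to get a figure in megabytes. With millions of entries the full walk may be too slow to run on every tick.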

Guffa
  • Yeah, keys are unique. My fallback was to estimate the size of one k/v pair and scale that value by the size of the collection. My main problem is that the value in the pair is a variable-length string, and once I break 1 million records, that +/-50% wiggle room in the value can be large. – kmarks2 Dec 05 '11 at 19:07
  • For strings the memory usage is usually simple, being length * 2 plus some overhead (12/20 bytes depending on 32/64 bit platform). Note however that if you create strings using a `StringBuilder` the string may contain extra unused space at the end. – Guffa Dec 05 '11 at 19:36
  • This is a thought...I'll try this and see if the fact that there are 1-2 million entries is a performance deal breaker. – kmarks2 Dec 05 '11 at 19:39

Given your most recent comment, that the value is a variable-length string, it should be easy enough to calculate the size of each item in the dictionary. I would consider saving time and effort by creating your own caching object (possibly just wrapping a Dictionary) and keeping track of the total size as items are added to and removed from the cache. This way, at any point in time you can tell the total size of the values in the cache by looking at the value that you have been keeping track of all along.

If you need your cache to expose the full IDictionary functionality, you could implement the interface, delegating down to the "real" dictionary and modifying the cumulative size value in the Add and Remove operations. If you don't need your cache to expose the full IDictionary functionality, simply define a stripped-down interface (with maybe just Add, Contains, and Remove methods and a CumulativeSize property). Or, you might decide to implement a caching object without an interface. If it were me, I would either use IDictionary or define an interface, like ICache.

So, your cache might look something like this (uncompiled and untested):

using System.Collections.Generic;

public interface ICacheWithCumulativeSize
{
  void Add(string key, string value);
  bool Contains(string key);
  void Remove(string key);
  int CumulativeSize { get; }
}

public class MyCache : ICacheWithCumulativeSize
{
  private IDictionary<string, string> dict = new Dictionary<string, string>();

  // Running total of the lengths (in chars) of all stored values.
  public int CumulativeSize { get; private set; }

  public void Add(string key, string value)
  {
    // No check for an existing key: overwriting a value skews the total.
    CumulativeSize += value.Length;
    dict[key] = value;
  }

  public bool Contains(string key)
  {
    return dict.ContainsKey(key);
  }

  public void Remove(string key)
  {
    // Throws KeyNotFoundException if the key is not present.
    string toRemove = dict[key];
    CumulativeSize -= toRemove.Length;
    dict.Remove(key);
  }
}

This is pretty rough. It could obviously be more efficient and more robust: I am not checking in Add and Remove whether a key already exists, for example, but you probably get the idea. Also, .NET strings are immutable, so under normal circumstances the length subtracted when a value is removed is the same as the length that was added. If you are still worried about the stored values being altered some other way (maybe not in your program, but theoretically), you could consider storing copies of the values in the internal dictionary. I don't know enough about your application to say whether this is a good idea or not.
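One way the missing key checks might be filled in, and the running total turned into the megabyte figure the question wants to log, is sketched below. `SizeTrackingCache`, `Set`, `CumulativeChars`, and `ApproximateMegabytes` are hypothetical names, and the conversion assumes 2 bytes per character with no per-object overhead, so it understates the real footprint.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the cache above, with the key checks added.
public class SizeTrackingCache
{
    private readonly Dictionary<string, string> dict = new Dictionary<string, string>();

    // Running total of value lengths in UTF-16 chars; long leaves headroom for large caches.
    public long CumulativeChars { get; private set; }

    private static int Len(string s)
    {
        return s == null ? 0 : s.Length;
    }

    // Adds or replaces a value, keeping the running total consistent.
    public void Set(string key, string value)
    {
        string old;
        if (dict.TryGetValue(key, out old))
        {
            CumulativeChars -= Len(old); // replacing: drop the old value's length first
        }
        dict[key] = value;
        CumulativeChars += Len(value);
    }

    // Returns false, and leaves the total untouched, when the key is absent.
    public bool Remove(string key)
    {
        string old;
        if (!dict.TryGetValue(key, out old))
        {
            return false;
        }
        CumulativeChars -= Len(old);
        return dict.Remove(key);
    }

    // Approximate value payload in megabytes (2 bytes per char, overhead ignored).
    public double ApproximateMegabytes
    {
        get { return CumulativeChars * 2 / (1024.0 * 1024.0); }
    }
}
```

The timer handler from the question would then only read `ApproximateMegabytes`, which costs the same no matter how many entries the cache holds.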

For completeness, here is a rough implementation that simply wraps a dictionary, exposes the IDictionary interface, and keeps track of the total size of items in the cache. It has a little more defensive code, primarily to protect the size accumulator. The only part that I might consider tricky is the indexer setter: my implementation checks whether the key being set already exists. If so, the cumulative value is decremented by the size of the old value and then incremented by the size of the new one. Otherwise, I think it is pretty straightforward.

  public class MySpecialDictionary : IDictionary<string, string>
  {
    private IDictionary<string, string> dict = new Dictionary<string, string>();

    public int TotalSize { get; private set; }

    #region IDictionary<string,string> Members

    public void Add(string key, string value)
    {
      dict.Add(key, value);
      TotalSize += string.IsNullOrEmpty(value) ? 0 : value.Length;
    }

    public bool ContainsKey(string key)
    {
      return dict.ContainsKey(key);
    }

    public ICollection<string> Keys
    {
      get { return dict.Keys; }
    }

    public bool Remove(string key)
    {
      string value;
      if (dict.TryGetValue(key, out value))
      {
        TotalSize -= string.IsNullOrEmpty(value) ? 0 : value.Length;
      }
      return dict.Remove(key);
    }

    public bool TryGetValue(string key, out string value)
    {
      return dict.TryGetValue(key, out value);
    }

    public ICollection<string> Values
    {
      get { return dict.Values; }
    }

    public string this[string key]
    {
      get
      {
        return dict[key];
      }
      set
      {
        string v;
        if (dict.TryGetValue(key, out v))
        {
          TotalSize -= string.IsNullOrEmpty(v) ? 0 : v.Length;
        }
        dict[key] = value;
        TotalSize += string.IsNullOrEmpty(value) ? 0 : value.Length;
      }
    }

    #endregion

    #region ICollection<KeyValuePair<string,string>> Members

    public void Add(KeyValuePair<string, string> item)
    {
      dict.Add(item);
      TotalSize += string.IsNullOrEmpty(item.Value) ? 0 : item.Value.Length;
    }

    public void Clear()
    {
      dict.Clear();
      TotalSize = 0;
    }

    public bool Contains(KeyValuePair<string, string> item)
    {
      return dict.Contains(item);
    }

    public void CopyTo(KeyValuePair<string, string>[] array, int arrayIndex)
    {
      dict.CopyTo(array, arrayIndex);
    }

    public int Count
    {
      get { return dict.Count; }
    }

    public bool IsReadOnly
    {
      get { return dict.IsReadOnly; }
    }

    public bool Remove(KeyValuePair<string, string> item)
    {
      // Only adjust the size if the exact key/value pair was actually removed.
      bool removed = dict.Remove(item);
      if (removed)
      {
        TotalSize -= string.IsNullOrEmpty(item.Value) ? 0 : item.Value.Length;
      }
      return removed;
    }

    #endregion

    #region IEnumerable<KeyValuePair<string,string>> Members

    public IEnumerator<KeyValuePair<string, string>> GetEnumerator()
    {
      return dict.GetEnumerator();
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
      return dict.GetEnumerator();
    }

    #endregion
  }

Good luck!

wageoghe
  • I got through your first paragraph only before marking this as the solution. The solution, however implemented, needs to keep a rolling sum to be performant. It's so clearly and simply the correct answer I'm facepalming myself that I didn't think of it myself. Thanks again. – kmarks2 Dec 05 '11 at 20:10