Background:
I am writing a debugger for an interpreted language in C#. I am only writing the "debugger server", which handles the language-specific tasks, such as listing local variables. The result is then sent to a "debugger client" (a text editor that displays the listed variables), such as VS Code.
The debugger uses the Debug Adapter Protocol by Microsoft, so it needs a specific format. In order to properly display complex data types (objects, arrays) in the debugger window, every object needs to be assigned a variableReference, which is a unique integer ID for that object.
When the user clicks on an object in order to "unfold" it in the "Variables" display, a request is sent to the debugger server with the variable reference, and a response is sent back to the client with the values inside that object.
My progress so far:
In order to identify which object corresponds to which ID, I have created a bi-directional map between objects that the debugger keeps track of, and their IDs (implemented as 2 dictionaries).
When I see an object in the debugger, I try to look it up in this map to get its ID. If it is not present, I assign it a new ID and save it into the map for later. When the frontend asks for the values in an object (because the user clicked on it), I look up the object in the map by its ID and resolve the request.
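The map described above can be sketched roughly as follows. `ObjectIdMap` and its member names are hypothetical, not the actual debugger code, and the equality comparer is left as a constructor parameter because choosing it is exactly the open question in this post.

```csharp
using System.Collections.Generic;

// Hypothetical sketch of the two-dictionary map; all names are illustrative.
public sealed class ObjectIdMap
{
    private readonly Dictionary<object, int> _idByObject;
    private readonly Dictionary<int, object> _objectById = new Dictionary<int, object>();

    // IDs start at 1; 0 is assumed to be reserved for "nothing to unfold".
    private int _nextId = 1;

    public ObjectIdMap(IEqualityComparer<object> comparer)
    {
        _idByObject = new Dictionary<object, int>(comparer);
    }

    // Returns the existing ID of obj, or registers obj under a fresh ID.
    public int GetOrAddId(object obj)
    {
        if (_idByObject.TryGetValue(obj, out int id))
            return id;

        id = _nextId++;
        _idByObject.Add(obj, id);
        _objectById.Add(id, obj);
        return id;
    }

    // Resolves an ID sent by the client back to the tracked object.
    public bool TryGetObject(int id, out object obj)
    {
        return _objectById.TryGetValue(id, out obj);
    }
}
```

Both dictionaries hold strong references, so in this sketch a registered object stays alive until the map itself is discarded.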
The problem:
How should I store the objects in the map so that I can look up an existing object's ID, or easily tell that an object is not yet registered?
I have tried using a hashmap (dictionary), where the hash is computed from the object's address and equality is implemented as reference equality. Note that computing the hash from the object's contents is not possible, since the contents can change during debugging, so the object would not be found in the dictionary after its hash has changed.
I need a hash that stays the same even when the contents of the object change. The object's address seems perfect for that, but I can't find a way to obtain it that works reliably. For example, this:
    public int GetHashCode(object obj)
    {
        GCHandle gch = GCHandle.Alloc(obj, GCHandleType.Pinned);
        IntPtr ptr = gch.AddrOfPinnedObject();
        return ptr.ToInt32();
    }
throws an exception with the message "Object contains non-primitive or non-blittable data."
However, that is not the only problem. It is my understanding that the GC can move objects around when it compacts memory, updating all references to them as it does so. This would change the address and therefore break the hash code.
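For comparison, one avenue not tried above is a comparer that avoids addresses altogether. `RuntimeHelpers.GetHashCode` calls the default `Object.GetHashCode` implementation non-virtually, so overrides that hash the contents are bypassed. As far as I understand, that value does not depend on the object's address or contents, but treat that as an assumption of this sketch, not something verified in this debugger. `IdentityComparer` is a hypothetical name.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical sketch: compares objects by identity only, never by content.
public sealed class IdentityComparer : IEqualityComparer<object>
{
    // Two references are equal only if they point to the same instance.
    bool IEqualityComparer<object>.Equals(object x, object y)
    {
        return ReferenceEquals(x, y);
    }

    // Uses the runtime's identity hash, ignoring any GetHashCode override.
    int IEqualityComparer<object>.GetHashCode(object obj)
    {
        return RuntimeHelpers.GetHashCode(obj);
    }
}
```

A dictionary created as `new Dictionary<object, int>(new IdentityComparer())` would then key on identity. On .NET 5 and later, the built-in `ReferenceEqualityComparer.Instance` appears to serve the same purpose.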
Possible solutions:
- Get rid of the hashmap (dictionary): When searching for the existing ID of an object, manually reference-compare it against all registered objects. This would fix the problem, but would be rather slow.
- Manually add the ID directly to the object on creation: This seems to make the most sense, but I am making the debugger as a mod, so this would require too many code changes for a mod IMO.
- Fix the GetAddress method: Even with the address change, this is a valid solution, because the object can be in the map twice and the debugger will still work properly.
- Allocate a new ID each time the object is seen: This would mean only a one-directional map from ID to object, and each object would appear in it multiple times with different IDs. It would work, but would be very heavy on memory.
- Get rid of the map altogether: This would require not only fixing the GetAddress method, but also using the address as the ID directly, and then casting from the ID (address) back to the object. This can be dangerous, since the address can contain arbitrary binary data by the time the ID request comes, and it will not work with long (64-bit) addresses, since the ID is only an integer.
Or any other possible solutions? How do debuggers usually solve this problem?
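For reference, the first option in the list (no dictionary, just reference comparison against every registered object) might look like the sketch below. `LinearIdRegistry` is a hypothetical name, and the ID is simply the list position plus one. Each lookup of an object walks the whole list, which is where the slowness comes from.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the "no hashmap" option: O(n) lookup by reference.
public sealed class LinearIdRegistry
{
    private readonly List<object> _objects = new List<object>();

    // Scans all registered objects; registers obj if it is not found.
    public int GetOrAddId(object obj)
    {
        for (int i = 0; i < _objects.Count; i++)
        {
            if (ReferenceEquals(_objects[i], obj))
                return i + 1; // IDs are 1-based list positions
        }

        _objects.Add(obj);
        return _objects.Count;
    }

    // ID-to-object lookup stays cheap because the ID is the list position.
    public object GetObject(int id)
    {
        if (id < 1 || id > _objects.Count)
            throw new ArgumentOutOfRangeException(nameof(id));
        return _objects[id - 1];
    }
}
```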