
In the .NET BCL there are circular references between:

  • System.dll and System.Xml.dll
  • System.dll and System.Configuration.dll
  • System.Xml.dll and System.Configuration.dll

Here's a screenshot from .NET Reflector that shows what I mean:

[Image: .NET Reflector screenshot of the assembly references]

How Microsoft created these assemblies is a mystery to me. Is a special compilation process required to allow this? I imagine something interesting is going on here.

SuperBiasedMan
Drew Noakes
  • Very good question. I've never actually taken the time to inspect this, but I'm curious to know the answer. Indeed, it seems like Dykam has provided a sensible one. – Noldorin Aug 22 '09 at 17:53
  • Why are those DLLs not merged into one if they all require each other? Is there any practical reason for that? – Andreas Petersson Aug 22 '09 at 17:54
  • Interesting question... I'd like to know Eric Lippert's answer to this one! And as Andreas said, I wonder why they didn't put everything in the same assembly... – Thomas Levesque Aug 22 '09 at 18:02
  • Well, if one assembly needs to get updated, they won't need to touch the other ones. That's the only reason I see. Interesting question though. – Atmocreations Aug 23 '09 at 07:36
  • Take a look at this presentation (asmmeta files): http://www.msakademik.net/academicdays2005/Serge_Lidin.ppt – Mehrdad Afshari Aug 23 '09 at 14:26
  • @Andreas Petersson -- my guess is that assemblies are loaded lazily, so there's a chance that something using `mscorlib` might not necessarily use the configuration or XML APIs, in which case less memory is devoted to storing the IL. – Drew Noakes Nov 09 '09 at 13:30
  • @Mehrdad -- the link you pointed to is gone, but this has it: https://web.archive.org/web/20100806233100/http://www.msakademik.net/academicdays2005/Serge_Lidin.ppt – J.Merrill Dec 05 '14 at 04:21

9 Answers


I can only tell you how the Mono Project does this. The idea is quite simple, though it makes the code messy.

They first compile System.Configuration.dll without the part that needs the reference to System.Xml.dll. After this, they compile System.Xml.dll the normal way. Now comes the magic: they recompile System.Configuration.dll, this time with the part that needs the reference to System.Xml.dll. The compilation now succeeds, with the circular reference in place.

In short:

  • A is compiled without the code needing B and the reference to B.
  • B is compiled.
  • A is recompiled, this time with the code needing B and the reference to B.
Dykam

RBarryYoung and Dykam are onto something. Microsoft uses an internal tool that uses ILDASM to disassemble assemblies, strips all internal/private members and method bodies, and reassembles the IL (using ILASM) into what is called a 'dehydrated assembly' or metadata assembly. This is done every time the public interface of an assembly changes.

During the build, the metadata assemblies are used instead of the real ones. That way the cycle is broken.

Srdjan Jovcic
  • Interesting answer, do you have any links? – H H Aug 22 '09 at 18:11
  • I am trying to find an external reference to the tool. I do not think it is published outside Microsoft, but the concept is simple: disassemble, strip internals, reassemble. – Srdjan Jovcic Aug 23 '09 at 01:42
  • Agreed - interesting answer. Some links to back this up would be good. – Drew Noakes Aug 24 '09 at 19:03
  • Yes, that's indeed the way it is done (from personal experience). – Pavel Minaev Oct 06 '09 at 17:12
  • If they did indeed strip all internal and private stuff, it would not be possible to decompile it, see what the private classes are, what others they reference, etc. This is, however, very well possible, with the exception of methods decorated with the attribute `MethodImpl(MethodImplOptions.InternalCall)`, of which there are only a few. Perhaps I misunderstood your point. A link that backs up your story could be helpful, too. – Abel Nov 01 '09 at 17:01
  • True. We do not need the internal and private stuff in assembly A to be able to reference its publics when compiling assembly B. Your point about InternalCall methods stands, but is not relevant to this discussion. Sorry, but there is no link. To my knowledge, this is not documented publicly. – Srdjan Jovcic Nov 05 '09 at 00:33
  • But these are strongly signed assemblies. These dehydrated metadata-only assemblies would have different cryptographic hashes. Perhaps the compiler doesn't check this, and it's a runtime concept only. Or maybe it's a special compiler too. – Drew Noakes Nov 09 '09 at 13:33
  • They are not strongly signed until after the build (they are delay signed), so dehydrated assemblies are not signed. – Srdjan Jovcic Nov 09 '09 at 18:23

It can be done the way Dykam described but Visual Studio blocks you from doing it.

You'll have to use the command-line compiler csc.exe directly.

  1. csc /target:library ClassA.cs

  2. csc /target:library ClassB.cs /reference:ClassA.dll

  3. csc /target:library ClassA.cs ClassC.cs /reference:ClassB.dll


//ClassA.cs
namespace CircularA {
    public class ClassA {
    }
}


//ClassB.cs
using CircularA;
namespace CircularB {
    public class ClassB : ClassA  {
    }
}


//ClassC.cs
using CircularB; // needed so that ClassB resolves
namespace CircularA {
    class ClassC : ClassB {
    }
}
Alfred Myers
  • You can do this in Visual Studio too, though it's quite hard to do. The basic way is to use #ifs and remove the reference using the Solution Explorer, reversing that in the third step. Another way I'm thinking of is a third project file including the same files but with different references. This would work as you can specify the build order. – Dykam Aug 22 '09 at 18:04
  • As far as I know; I can't test it here. – Dykam Aug 22 '09 at 18:04
  • I would really like to see that. From what I experimented with here, the moment you try to Add Reference, the IDE stops you. – Alfred Myers Aug 22 '09 at 18:06
  • I know. But consider a third project that has neither that reference nor the #if symbol, and that is referenced by the second, which is referenced by the first. No cycle. The third uses the code of the first and outputs to the first assembly's location. An assembly can easily be replaced by another with the same specs. But I think strong naming can cause a problem with this method. – Dykam Aug 22 '09 at 18:17
  • It's a little like Srdjan's answer, though a different method. – Dykam Aug 22 '09 at 18:18

It's pretty easy to do in Visual Studio as long as you don't use project references... Try this:

  1. Open Visual Studio
  2. Create 2 Class Library projects "ClassLibrary1" & "ClassLibrary2".
  3. Build
  4. From ClassLibrary1 add a reference to ClassLibrary2 by browsing to the dll created in step 3.
  5. From ClassLibrary2 add a reference to ClassLibrary1 by browsing to the dll created in step 3.
  6. Build again (Note: if you make changes in both projects you would need to build twice to make both references "fresh")

So this is how you do it. But seriously... Don't you EVER do it in a real project! If you do, Santa won't bring you any presents this year.

JohannesH

I guess it could be done by starting with an acyclic set of assemblies and using ILMerge to then coalesce the smaller assemblies into logically related groups.

Steve Gilham

Well, I've never done it on Windows, but I have done it on a lot of the compile-link-rtl environments that served as the practical progenitors for it. What you do is first make stub "targets" without the cross-references, then link, then add the circular references, then re-link. The linkers generally do not care about circular refs or following ref chains; they only care about being able to resolve each reference on its own.

So if you have two libraries, A and B that need to reference each other, try something like this:

  1. Link A without any refs to B.
  2. Link B with refs to A.
  3. Link A, adding in the refs to B.

Dykam makes a good point: it's compile, not link, in .NET, but the principle remains the same. Make your cross-referenced sources, with their exported entry points, but with all but one of them having their own references to the others stubbed out. Build them like that. Then unstub the external references and rebuild them. This should work even without any special tools; in fact, this approach has worked on every operating system that I have ever tried it on (about 6 of them). Obviously, though, something that automates it would be a big help.

RBarryYoung
  • The theory is right. However, in the .NET world linking is done dynamically and is not a problem. It's the compilation step where this solution is needed. – Dykam Aug 22 '09 at 17:48
  • Sorry to correct you again :P. But the referencing (linking) at compile time happens in the .NET world, which is everything derived from that specific ECMA spec: Mono, DotGNU and .NET. Not Windows itself. – Dykam Aug 23 '09 at 17:47

One possible approach is to use conditional compilation (#if) to first compile a System.dll that doesn't depend on those other assemblies, then compile the other assemblies, and finally recompile System.dll to include the parts that depend on Xml and Configuration.
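As a rough illustration of the idea, here is a minimal sketch. The names `Core`, `Extras`, `Settings`, `Reader` and the `FULL_BUILD` symbol are all made up for this example and have nothing to do with the real BCL sources; the build commands in the comments are just one way to drive the passes with csc.

```csharp
// Core.cs (hypothetical): this file is compiled twice.
//   Pass 1: csc /target:library /out:Core.dll Core.cs
//   Pass 3: csc /target:library /out:Core.dll /define:FULL_BUILD /reference:Extras.dll Core.cs
namespace Core
{
    public class Settings
    {
        public string Name = "default";

#if FULL_BUILD
        // Only compiled in the last pass, once Extras.dll exists.
        public Extras.Reader CreateReader()
        {
            return new Extras.Reader(this);
        }
#endif
    }
}

// Extras.cs (hypothetical): compiled once, between the two Core passes.
//   Pass 2: csc /target:library /out:Extras.dll /reference:Core.dll Extras.cs
namespace Extras
{
    public class Reader
    {
        private readonly Core.Settings settings;

        public Reader(Core.Settings settings)
        {
            this.settings = settings;
        }

        // Uses the Core type, so Extras.dll really does reference Core.dll.
        public string Name
        {
            get { return settings.Name; }
        }
    }
}
```

After the last pass, Core.dll references Extras.dll and Extras.dll references Core.dll. Both namespaces are shown in one listing for brevity; in the scenario described they would live in separate source files.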

Daniel
  • Unfortunately this doesn't allow you to conditionally reference an assembly (I wish it was possible, it would really help in one of my projects...) – Thomas Levesque Aug 22 '09 at 18:24
  • Conditional references can be easily done by editing the .csproj file. Just add a Condition attribute to the element. – Daniel Aug 22 '09 at 21:15

Technically, it's possible that these were not compiled at all, and assembled by hand. These are low level libraries, after all.

tylermac
  • Not really. There isn't much low-level stuff in it, only basic stuff. What made you think it would be low level? The runtime and corlib are low level, relatively. Still plain C or C++, though the JIT contains low-level stuff. – Dykam Oct 27 '09 at 16:29

Agreed. asmmeta.exe is like ildasm, but it omits all the IL (leaving just ret) and some privates, though sometimes privates are needed, for example for struct sizes.

The more general idea is that of a multi-pass build, which Microsoft has relied on heavily forever.

The stripped-down ildasm output can be thought of as a "header" file, in a system that does not really have them.

First, visit each directory (with massive parallelism!) running ilasm. Then visit each directory (again with massive parallelism) running csc. After csc, in the same pass, run the ildasm-like tool to regenerate the "headers" and compare them with the originals. If there are any mismatches, the build is broken: a developer failed to update the header. It is too late to just patch it up without restarting the build (though perhaps with a proper dependency graph, most directories will not be affected).

This is also a way to upgrade versions easily. The ilasm-like code can use names for version numbers. Though this is really a minor outcome of a multi-pass build.

Jay K