
The premise of my question, in plain English:

  • A library named Foo depends on a library named Bar
  • A class within Foo extends a class within Bar
  • Foo defines properties/methods that simply pass-through to Bar
  • An application, FooBar, depends only on Foo

Consider the following sample:

class Program
{
    static void Main(string[] args)
    {
        Foo foo = Foo.Instance;

        int id = foo.Id; // Compiler is happy
        foo.DoWorkOnBar(); // Compiler is not happy
    }
}

Foo is defined as follows:

public class Foo : Bar
{
    public new static Foo Instance { get => (Foo)Bar.Instance; }

    public new int Id { get => Bar.Id; }

    public void DoWorkOnBar()
    {
        Instance.DoWork();
    }
}

Bar is defined as follows:

public class Bar
{
    public static Bar Instance { get => new Bar(); }

    public static int Id { get => 5; }

    public void DoWork() { }
}

The part that is completely stumping me:

Without a reference to the Bar library

  • FooBar can retrieve the ID that is provided by Bar (or at least it compiles)
  • FooBar cannot request Foo to do work that is ultimately accomplished by Bar

The compiler error associated with foo.DoWorkOnBar(); is

The type 'Bar' is defined in an assembly that is not referenced. You must add a reference to assembly 'Bar, Version 1.0.0.0, Culture=Neutral, PublicKeyToken=null'.

Why does there appear to be a disparity in the compiler?

I would have assumed that neither of these operations would compile without FooBar adding a reference to Bar.

Matt
  • Because the property is defined directly on `Foo`, the compiler is happy. But when you call `Instance.DoWork()`, and `DoWork` is defined on `Bar`, the compiler needs to know where it can find that `DoWork` method, so it needs the reference to `Bar`. – Steve Jul 26 '18 at 20:38
  • Both `Id` and `DoWorkOnBar` are defined directly on `Foo` and request information/do work from `Bar` - I'm not quite following the distinction you are making here. I.e. `Bar.Id` is not defined on `Foo`, but is necessary to retrieve `Foo.Id`, but the compiler doesn't complain about that. – Matt Jul 26 '18 at 20:55
  • I would agree it doesn't look right, especially since, with only the property access (and hence successful compilation with only `Foo` referenced) the application fails at runtime due to the `Bar` assembly being missing. It would appear that the reference to `Bar` is required in both cases, but the compiler is only flagging the issue in the method call case. – Iridium Jul 26 '18 at 21:20
  • Does the issue persist if you reduce the implementation to `public class Bar { }` and `public class Foo : Bar { public static Foo Instance => null; public int Id => 42; public void DoWorkOnBar() { } }`? Wild guess: it has something to do with the fact that methods can be overloaded while properties cannot. – user4003407 Jul 26 '18 at 22:38
  • The smallest `Foo` that you can use to repro the issue is `public class Foo : Bar { public static int P => 0; public static int M() => 0; }`. Invoking `Foo.P` is no problem; invoking `Foo.M()` makes the compiler clamor for `Bar`. Why? Now that would probably take a compiler writer to explain further. (But @PetSerAl's guess of "blame overload resolution" is a safe one, given that this is by far the most complicated part of the language, and has lots of interesting dark corners.) – Jeroen Mostert Jul 27 '18 at 11:02
  • @Matt I suggest that you [edit] your question and reduce the `Foo` and `Bar` implementations to the minimal version necessary to reproduce the error, so that others do not get on the wrong track because of the extra distraction. – user4003407 Aug 01 '18 at 13:25
  • Thanks for this excellent question. Had a hard time finding this, because it's even hard to search for the problem :D – v01pe Jun 30 '21 at 11:13

1 Answer


First, note that the implementations of Foo.Id and Foo.DoWorkOnBar are irrelevant; the compiler treats foo.Id and foo.DoWorkOnBar() differently even if the implementations don’t access Bar:

// In class Foo:
public new int Id => 0;
public void DoWorkOnBar() { }

The reason that foo.Id compiles successfully but foo.DoWorkOnBar() doesn’t is that the compiler uses different logic¹ to look up properties versus methods.

For foo.Id, the compiler first looks for a member named Id in Foo. When the compiler sees that Foo has a property named Id, the compiler stops the search and doesn’t bother looking at Bar. The compiler can perform this optimization because a property in a derived class shadows all members with the same name in a base class, so foo.Id will always refer to Foo.Id, no matter what members might be named Id in Bar.
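The shadowing rule can be seen in isolation with a small sketch. The class names `BaseThing`, `DerivedThing`, and `PropertyDemo` are hypothetical and only stand in for `Bar` and `Foo`; everything lives in one assembly here, so this illustrates the hiding rule itself, not the missing-reference scenario.

```csharp
// Hypothetical stand-in for Bar: exposes a static property named Id.
public class BaseThing
{
    public static int Id => 5;
}

// Hypothetical stand-in for Foo: declares its own instance property named Id.
// The 'new' modifier states explicitly that it hides BaseThing.Id.
public class DerivedThing : BaseThing
{
    public new int Id => 42;
}

public static class PropertyDemo
{
    // Through a DerivedThing reference, 'Id' binds to DerivedThing.Id,
    // regardless of which members named Id exist in BaseThing.
    public static int Read() => new DerivedThing().Id; // returns 42
}
```

Because the derived property hides every base member with the same name, the result of `PropertyDemo.Read()` cannot depend on what `BaseThing` declares.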

For foo.DoWorkOnBar(), the compiler first looks for a member named DoWorkOnBar in Foo. When the compiler sees that Foo has a method named DoWorkOnBar, the compiler continues searching all base classes for methods named DoWorkOnBar. The compiler does this because (unlike properties) methods can be overloaded, and the compiler implements² the overload resolution algorithm in essentially the same way it’s described in the C# specification:

  1. Start with the “method group” consisting of the set of all overloads of DoWorkOnBar declared in Foo and its base classes.
  2. Narrow the set down to “candidate” methods (basically, the methods whose parameters are compatible with the supplied arguments).
  3. Remove any candidate method that is shadowed by a candidate method in a more derived class.
  4. Choose the “best” of the remaining candidate methods.

Step 1 triggers the requirement for you to add a reference to assembly Bar.
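A minimal sketch of step 3 in action, using hypothetical classes `BaseWorker`, `DerivedWorker`, and `OverloadDemo` (none of these names come from the question). Both overloads are in the initial method group, both are applicable to an `int` argument, and the base-class candidate is then removed because an applicable candidate exists in the more derived class:

```csharp
// Hypothetical base class with an overload that matches an int argument exactly.
public class BaseWorker
{
    public string DoWork(int x) => "BaseWorker.DoWork(int)";
}

// Hypothetical derived class with an overload that needs an int -> long conversion.
public class DerivedWorker : BaseWorker
{
    public string DoWork(long x) => "DerivedWorker.DoWork(long)";
}

public static class OverloadDemo
{
    // The argument 1 is an int, yet the applicable overload declared in the
    // more derived class is chosen over the exact match in the base class.
    public static string Call() => new DerivedWorker().DoWork(1);
    // returns "DerivedWorker.DoWork(long)"
}
```

The base-class overload never wins here, yet the algorithm as described still has to enumerate it in step 1 before discarding it in step 3, and that enumeration is what requires the base class's assembly.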

Could a C# compiler implement the algorithm differently? According to the C# specification:

The intuitive effect of the resolution rules described above is as follows: To locate the particular method invoked by a method invocation, start with the type indicated by the method invocation and proceed up the inheritance chain until at least one applicable, accessible, non-override method declaration is found. Then perform type inference and overload resolution on the set of applicable, accessible, non-override methods declared in that type and invoke the method thus selected.

So it seems to me that the answer is “Yes”: a C# compiler could theoretically see that Foo declares an applicable DoWorkOnBar method and not bother looking at Bar. For the Roslyn compiler, however, this would involve a major rewrite of the compiler’s member lookup and overload resolution code—probably not worth the effort given how easily developers can resolve this error themselves.
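To make the "intuitive" lookup concrete, here is a rough sketch of it written with reflection. It is not how Roslyn works and it is deliberately simplified: it matches on method name only and ignores applicability and accessibility checks. `SampleBar`, `SampleFoo`, and `IntuitiveLookup` are hypothetical names used for illustration.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical stand-ins for Bar and Foo.
public class SampleBar { public void DoWork() { } }
public class SampleFoo : SampleBar { public void DoWorkOnBar() { } }

public static class IntuitiveLookup
{
    // Walks up the inheritance chain and returns the first type that declares
    // a public, non-override method with the given name (or null if none does).
    // Simplification: argument applicability is not checked.
    public static Type FirstDeclaringType(Type start, string methodName)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance
                                 | BindingFlags.Static | BindingFlags.DeclaredOnly;

        for (Type t = start; t != null; t = t.BaseType)
        {
            bool declares = t.GetMethods(flags).Any(m =>
                m.Name == methodName &&
                m.GetBaseDefinition().DeclaringType == t); // skip overrides

            if (declares)
                return t; // stop here; base types are never inspected
        }
        return null;
    }
}
```

With this shape, `IntuitiveLookup.FirstDeclaringType(typeof(SampleFoo), "DoWorkOnBar")` returns `typeof(SampleFoo)` without ever reading `BaseType`, which is the behavior the spec's intuitive description would permit.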


TL;DR — When you invoke a method, the compiler needs you to reference the base class assembly because that’s the way the compiler was implemented.


¹ See the LookupMembersInClass method of the Microsoft.CodeAnalysis.CSharp.Binder class.

² See the PerformMemberOverloadResolution method of the Microsoft.CodeAnalysis.CSharp.OverloadResolution class.

Michael Liu
  • I agree with the assertion that this is caused by the need for overload resolution, but is it really necessary to look into the base class definition if any of the derived class overloads is applicable? AFAIK overloads defined in a derived class are strictly preferred even if an overload defined in a base class is a better match by signature. For example: https://ideone.com/SBpJuD Here it calls the `long` overload in the derived class even though the `int` overload in the base class matches better. – user4003407 Aug 01 '18 at 19:24
  • @PetSerAl: You're probably right. Looking at the [source code](https://github.com/dotnet/roslyn/blob/87332a897eeea6f05a2b2ef70ae18a9fba1f29d6/src/Compilers/CSharp/Portable/Binder/Semantics/OverloadResolution/OverloadResolution.cs), it looks like the compiler implements overload resolution according to the algorithm laid out in the spec: First construct the set of candidate methods (which requires looking at the base class), then discard those declared in a less derived class. – Michael Liu Aug 01 '18 at 21:21
  • Wow! I banged my head against the wall because it didn't make sense to me. Now I understand it, but I'm angry that it doesn't work :D – v01pe Jun 30 '21 at 11:11