
I work on a large scale platform project supporting around 10 products that use our code.

So far, all of the products have been using the full functionality of our platform:
- Retrieval of configuration data from a database
- Remote file system access
- Security authorization
- Base logic (the thing we are paid to offer)

For a new product we've been asked to support a smaller subset of functionality without the infrastructure the platform brings along. Our architecture is old (coding started around 2005) but reasonably solid.

We're confident we can do that using DI on our existing classes, but the estimated times to do so range from 5 to 70 weeks depending on who you talk to.

There are a lot of articles out there that tell you how to do DI, but I couldn't find any that tell you how to refactor for DI in the most efficient way. Are there tools that do this, rather than having to go through 30,000 lines of code, hitting CTRL+R to extract interfaces and add them to constructors over and over? (We have ReSharper, if that helps.) If not, what do you find is the ideal workflow to achieve this quickly?

user1970723
    You could start out by refactoring only the small subset of the code you need for the task at hand. That way you and your team will get a feel for DI and gather some DI experience. Since the decoupled architecture that DI motivates is very test-friendly, you can use unit testing to make sure you don't break anything. – lasseeskildsen Jan 11 '13 at 19:55
    This question is probably better suited to http://programmers.stackexchange.com – Dave Hillier Jan 12 '13 at 00:08
    There are no tools for this. Think about how you would analyze each class yourself to find out which dependencies to extract. A tool can't reliably do this analysis for you (or it will extract too much). There are tools, however (such as ReSharper and CodeRush), that will help you with Extract Method and Extract Class refactorings, but this is still one class at a time; not one click for the complete project. – Steven Jan 12 '13 at 09:43
    For me the ideal workflow is: only change what has to be touched. If there is a bug in a class, you will need to touch that class. Before you touch that class, you will need to test it. To be able to test it, you will need to refactor it. That's a good time for refactoring. The same holds for adding features or supplying a smaller subset. Only touch what has to be touched. – Steven Jan 12 '13 at 09:45
  • I suspected as much. Was hoping that some of this might be automated since it does seem suitable for that. Thanks for the tips. – user1970723 Jan 12 '13 at 11:41
    I've just inserted a DI container into a similar sized project 42K SLOCS and it took a little over 2 weeks to complete. The project was already written with DI in mind so constructor injection was already used and appropriate interfaces were already in place. I would suggest that Mark Seeman's book [Dependency Injection in .NET](http://www.manning.com/seemann/) would be a good starting point. Perhaps you could refactor to "DIY DI" as an intermediate step then move to a proper DI container after that. That's effectively what I did and it worked well. – Jack Hughes Jan 14 '13 at 16:55

7 Answers


I'm assuming that you're looking to use an IoC tool like StructureMap, Funq, Ninject, etc.

In that case, the work of refactoring really starts with updating your entry points (or Composition Roots) in the codebase. This could have a large impact, especially if you're making pervasive use of statics and managing lifetime of your objects (e.g. caching, lazy loads). Once you have an IoC tool in place and it wires the object graphs, you can start to spread out your usage of DI and enjoy the benefits.

I would focus on settings-like dependencies first (which should be simple value objects) and start making resolution calls with your IoC tool. Next, look to create Factory classes and inject those to manage the lifetime of your objects. It will feel like you are going backwards (and slowly) until you reach the crest where most of your objects use DI and, as a corollary, SRP; from there it should be downhill. Once you have better separation of concerns, the flexibility of your codebase and the speed at which you can make changes will dramatically increase.

Word of caution: don't let yourselves be fooled into thinking that sprinkling a "Service Locator" everywhere is a panacea; it's actually a DI anti-pattern. You may need it at first, but you should finish the DI work with either constructor or setter injection and remove the Service Locator.
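For contrast, constructor injection makes the dependency explicit in the signature. A minimal sketch, with invented type names (not from the question's codebase):

```csharp
using System;

// Illustrative abstraction and a trivial implementation.
public interface IConfigStore
{
    string Get(string key);
}

public class DemoConfigStore : IConfigStore
{
    public string Get(string key) { return "demo:" + key; }
}

// Constructor injection: the dependency is visible in the signature,
// so the composition root can wire the whole object graph in one place.
public class ReportService
{
    private readonly IConfigStore config;

    public ReportService(IConfigStore config)
    {
        if (config == null) throw new ArgumentNullException("config");
        this.config = config;
    }

    public string Title() { return config.Get("title"); }
}
```

The composition root is then the only place that knows about `DemoConfigStore`: `var svc = new ReportService(new DemoConfigStore());`.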

Brett Veenstra

Thanks for all the replies. We’re almost a year further now and I think I can mostly answer my own question.

We of course converted only the parts of our platform that were to be reused in the new product, as lasseeskildsen points out. Since this was only a partial conversion of the code base, we went with the DIY approach to dependency injection.

Our focus was making these parts available without bringing along unwanted dependencies, not to enable unit testing of them. This makes a difference in how you approach the problem. There are no real design changes in this case.

The work involved is mundane, hence the question of how to do it quickly or even automatically. The answer is that it cannot be automated, but with some keyboard shortcuts and ReSharper it can be done quite fast. For me this is the optimal workflow:

  1. We work across multiple solutions. We created a temporary “master” solution that contains all projects in all solution files. Though refactoring tools are not always smart enough to pick up the difference between binary and project references, this will at least make them work partially across multiple solutions.

  2. Create a list of all dependencies you need to cut. Group them by function. In most cases we were able to tackle multiple related dependencies at once.

  3. You’ll be making many small code changes across many files. This task is best done by a single developer, or two at most to avoid having to constantly merge your changes.

  4. Get rid of singletons first. After converting them away from this pattern, extract an interface (ReSharper -> Refactor -> Extract Interface), then delete the singleton accessor to get a list of build errors. On to step 6.

  5. For getting rid of other references: a. Extract an interface as above. b. Comment out the original implementation. This gets you a list of build errors.

  6. Resharper becomes a big help now:

    • Alt + shift + pg down/up quickly navigates broken references.
    • If multiple references share a common base class, navigate to its constructor and hit ctrl + r + s (“change method signature”) to add the new interface to the constructor. Resharper 8 offers you an option to “resolve by call tree”, meaning you can make inheriting classes have their signature changed automatically. This is a very neat feature (new in version 8 it seems).
    • In the constructor body assign the injected interface to a non-existing property. Hit alt + enter to select “create property”, move it to where it needs to be, and you’re done. Uncomment code from 5b.
  7. Test! Rinse and repeat.
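Steps 4 to 6, sketched on a hypothetical singleton (all names here are invented for illustration; this is roughly what the before and after look like):

```csharp
using System;

// Before (sketch): consumers called Logger.Instance.Write(...) directly.
//
// public class Logger
// {
//     public static readonly Logger Instance = new Logger();
//     public void Write(string msg) { Console.WriteLine(msg); }
// }

// After: interface extracted, static accessor deleted; every consumer
// that breaks in the build now takes ILogger through its constructor.
public interface ILogger
{
    void Write(string msg);
}

public class ConsoleLogger : ILogger
{
    public void Write(string msg) { Console.WriteLine(msg); }
}

public class OrderProcessor
{
    private readonly ILogger logger;

    public OrderProcessor(ILogger logger)
    {
        this.logger = logger;
    }

    public void Process()
    {
        logger.Write("processing order");
    }
}
```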

To make use of these classes in the original solution without major code changes, we created overloaded constructors that retrieve their dependencies through a service locator, as Brett Veenstra mentions. This may be an anti-pattern, but works for this scenario. It won’t be removed until all code supports DI.

We converted about a quarter of our code to DI like this in about 2-3 weeks (1.5 persons). One year on, we are now switching all our code to DI. This is a different situation, as the focus shifts to unit testability. I think the general steps above will still work, but it requires some actual design changes to enforce separation of concerns.

user1970723

You asked about tools. One tool that might help in a large refactoring like this is nDepend. I've used it to help identify places to target refactoring efforts.

I hesitate to mention it, because I don't want to give the impression that a tool like nDepend is necessary to undertake this project. It is, however, helpful to visualize the dependencies in your code base. It comes with a 14-day fully functional trial, which might be enough for your needs.

neontapir

I don't think there is any tool to do this conversion, because introducing DI into an existing code base involves:

  • Using interfaces / abstract classes. The right choices have to be made here to ease the conversion while keeping both the DI principle and the code's functionality in mind.

  • Effectively splitting or merging existing classes into small reusable units to keep the code modular.

SO19

The way I approach a conversion is to look at any part of the system that permanently modifies state: files, databases, external content. Once changed and re-read, has it changed for good? This is the first place to look to make changes.

So the first thing you do, is find a place that modifies a source like this:

class MyXmlFileWriter
{
   public bool WriteData(string fileName, string xmlText)
   {   
      // TODO: Sort out exception handling
      try 
      {
         File.WriteAllText(fileName, xmlText);  
         return true; 
      } 
      catch (Exception)
      { 
         return false; 
      }
   }
}

Secondly, you write a unit test to make sure you aren't breaking the code while refactoring.

[TestClass]
public class MyXmlFileWriterTests
{
   [TestMethod]
   public void WriteData_WithValidFileAndContent_ExpectTrue()
   {
      var target = new MyXmlFileWriter();
      var filePath = Path.GetTempFileName();

      target.WriteData(filePath, "<Xml/>");

      Assert.IsTrue(File.Exists(filePath));
   }

   // TODO: Check other cases
}

Next, Extract an Interface from the original class:

interface IFileWriter
{
   bool WriteData(string location, string content);
}

class MyXmlFileWriter : IFileWriter 
{ 
   /* As before */ 
}

Re-run the tests and hope all is good. Keep the original test as it is checking your older implementation works.

Next write a fake implementation that does nothing. We only want to implement a very basic behaviour here.

// Put this class in the test suite, not the main project
class FakeFileWriter : IFileWriter
{
   internal bool WriteDataCalled { get; private set; }

   public bool WriteData(string file, string content)
   {
       this.WriteDataCalled = true;
       return true;
   }
}

Then unit test it...

[TestClass]
public class FakeFileWriterTests
{
   private IFileWriter writer;

   [TestInitialize()]
   public void Initialize()
   {
      writer = new FakeFileWriter();
   }

   [TestMethod]
   public void WriteData_WhenCalled_ExpectSuccess()
   {
      writer.WriteData(null,null);
      Assert.IsTrue(writer.WriteDataCalled);
   }
}

Now with it unit tested and refactored versions still working, we need to make sure that when injected, the calling class is using the interface, not the concrete version!

// Before
class FileRepository
{
   public FileRepository() { }

   public void Save( string content, string xml )
   {
      var writer = new MyXmlFileWriter();
      writer.WriteData(content,xml);
   }
}

// After
class FileRepository
{
   private IFileWriter writer = null;

   public FileRepository() : this( new MyXmlFileWriter() ){ }
   public FileRepository(IFileWriter writer) 
   {
      this.writer = writer;
   }

   public void Save( string path, string xml)
   {
      this.writer.WriteData(path, xml);
   }
}

So what did we do?

  • Have a default constructor which uses the normal type
  • Have a constructor that takes an IFileWriter type
  • Used an instance field to hold the referenced object.

Then its a case of writing a unit test for the FileRepository and checking that the method is called:

[TestClass]
public class FileRepositoryTests
{
   private FakeFileWriter fakeWriter = null;
   private FileRepository repository = null;

   [TestInitialize()]
   public void Initialize()
   {
      this.fakeWriter = new FakeFileWriter();
      this.repository = new FileRepository(this.fakeWriter);
   }

   [TestMethod]
   public void Save_WhenCalled_ExpectWriterInvoked()
   {
      // Act
      this.repository.Save(null, null);

      // Assert: Save returns void, so check the fake recorded the call
      Assert.IsTrue(this.fakeWriter.WriteDataCalled);
   }
}

Okay, but here, are we really testing the FileRepository or the FakeFileWriter? We're testing the FileRepository as our other tests are testing the FakeFileWriter separately. This class - FileRepositoryTests would be more useful to test the incoming parameters for nulls.

The fake is not doing anything clever: no parameter validation, no I/O. It just stands in so that the FileRepository can save content without doing any real work. Its purpose is two-fold: to speed up unit testing significantly and to avoid disrupting the state of the system.

If this FileRepository had to read the file as well, you could implement an IFileReader too (which is a bit extreme), or just store the last written filePath/xml in a string in memory and retrieve that instead.


So with the basics covered, how do you approach this?

On a large project that requires a lot of refactoring, it is always best to incorporate unit testing into any class that undergoes a DI change. In theory, your data should not be committed from hundreds of locations in your code, but pushed through a few key places. Locate them in the code and add an interface for them. One trick I've used is to hide each DB or index-like source behind an interface like this:

interface IReadOnlyRepository<TKey, TValue>
{
   TValue Retrieve(TKey key);
}

interface IRepository<TKey, TValue> : IReadOnlyRepository<TKey, TValue>
{
   void Create(TKey key, TValue value);
   void Update(TKey key, TValue value);
   void Delete(TKey key);
}

This sets you up to retrieve from data sources in a very generic way. You can switch from an XmlRepository to a DbRepository by replacing only the place where it is injected. This can be extremely useful when a project migrates from one data source to another without affecting the internals of the system. It can be a handful to move from XML manipulation to objects, but it is much easier to maintain and to implement new functionality with this approach.
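For example, a throwaway in-memory implementation of these interfaces (the `DictionaryRepository` name is invented here) lets tests run without touching a database:

```csharp
using System.Collections.Generic;

public interface IReadOnlyRepository<TKey, TValue>
{
    TValue Retrieve(TKey key);
}

public interface IRepository<TKey, TValue> : IReadOnlyRepository<TKey, TValue>
{
    void Create(TKey key, TValue value);
    void Update(TKey key, TValue value);
    void Delete(TKey key);
}

// Invented name: a dictionary-backed repository for tests and sketches.
public class DictionaryRepository<TKey, TValue> : IRepository<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> store = new Dictionary<TKey, TValue>();

    public TValue Retrieve(TKey key) { return store[key]; }
    public void Create(TKey key, TValue value) { store.Add(key, value); }
    public void Update(TKey key, TValue value) { store[key] = value; }
    public void Delete(TKey key) { store.Remove(key); }
}
```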

The only other advice I can give is: do one data source at a time and stick to it. Resist the temptation to do too many in one go. If you really end up having to save to files, a DB, and a web service in one hit, use Extract Interface, fake the calls, and return nothing. It is a real juggling act to do lots in one go, but you can slot them back in more easily than starting from first principles.

Good luck!

Dominic Zukiewicz
  • Your approach sounds reasonable. But wouldn't it be better to use a mocking library to create a stub of the IFileWriter instead of creating a FakeFileWriter? – Asier Barrenetxea May 10 '13 at 11:29
  • Yep mocking is fine. Saves you writing the unit tests for the dud implementations. Use the `CreateInstance` method again to set up the mock object and centralise the creation and necessary scenarios. – Dominic Zukiewicz May 11 '13 at 11:06
  • "Have a default constructor which uses the normal type" I think it is best to avoid this if possible. The problem is that your class is still hard wired to a concrete implementation of the interface. Ideally we only want to depend on interfaces. Default constructors force concrete dependencies. – Robin May 25 '13 at 04:32
  • @Robin - I agree. It should be just an intermediate step, as you can imagine the code changes required to point to the other constructor or factory call. So as a compromise, if you do go down this route, try adding `[Obsolete("Constructor will be deprecated shortly", false)]` to the default constructor to at least get compiler warnings about it. – Dominic Zukiewicz May 30 '13 at 10:28

This book would probably be very helpful:

Working Effectively with Legacy Code - Michael C. Feathers - http://www.amazon.com/gp/product/0131177052

I would suggest starting with small changes. Gradually move dependencies to be injected through the constructor. Always keep the system working. Extract interfaces from the constructor injected dependencies and start wrapping with unit tests. Bring in tools when it makes sense. You don't have to start using dependency injection and mocking frameworks right away. You can make a lot of improvements by manually injecting dependencies through the constructor.
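A sketch of such manual constructor injection, with invented types (no container or framework involved):

```csharp
using System;

// Illustrative abstraction: wrapping the system clock behind an
// interface so time-dependent code becomes testable.
public interface IClock
{
    DateTime Now { get; }
}

public class SystemClock : IClock
{
    public DateTime Now { get { return DateTime.Now; } }
}

public class InvoiceStamper
{
    private readonly IClock clock;

    // Dependency injected by hand through the constructor.
    public InvoiceStamper(IClock clock)
    {
        this.clock = clock;
    }

    public string Stamp(string invoiceId)
    {
        return invoiceId + "@" + clock.Now.ToString("u");
    }
}
```

The "composition root" is then just ordinary code: `var stamper = new InvoiceStamper(new SystemClock());`, and a test can pass a fake clock with a fixed time instead.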

Robin

What you described is a big part of the process: going through each class, creating an interface, and registering it. This is most problematic if you commit right away to refactoring to the composition root; in the case of MVC, that would mean assuming you were going to inject into the controller.

That could be a lot of work, and if the code does a lot of direct object creation it could be very complex to try to do it all at once. In these cases I think it's acceptable to use the Service Locator pattern and manually call resolve.

Start by replacing some of your direct calls to constructors with a service locator resolve call. This will lower the amount of refactoring initially needed and start giving you the benefits of DI.
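A deliberately minimal hand-rolled locator of the kind described might look like this (a sketch only; real containers add lifetime management, auto-wiring, and more):

```csharp
using System;
using System.Collections.Generic;

// Minimal service locator: register a factory per abstraction,
// resolve by type. Intended purely as a stepping stone toward a
// proper composition root, not a final design.
public static class ServiceLocator
{
    private static readonly Dictionary<Type, Func<object>> factories =
        new Dictionary<Type, Func<object>>();

    public static void Register<T>(Func<T> factory) where T : class
    {
        factories[typeof(T)] = () => factory();
    }

    public static T Resolve<T>() where T : class
    {
        return (T)factories[typeof(T)]();
    }
}
```

A direct `new` of a concrete class then becomes a `ServiceLocator.Resolve<ISomeInterface>()` call, and each such call site can later be migrated to constructor injection as it drifts toward the composition root.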

Over time your calls will get closer and closer to the composition root and then you can start removing use of the service locator.

Honorable Chow