When is a class too big or too small?

Question

I recently had my code reviewed by a senior developer in my company. He criticized my design for using too many classes. I am interested to hear your reactions.

I was tasked to create a service which produces an xml file as a result of manipulating 3 other xml files. Let's name these aaa.xml, bbb.xml and ccc.xml. The service works in two phases. In phase one it scrubs aaa.xml against bbb.xml. In the second phase, it merges the product of phase one with ccc.xml to produce a final result.

I chose a design with three classes: an XmlService class which used two other classes, a scrubber class and a merger class. I kept the scrubbing and merging classes separate because the both classes were large and featured distinct logic.

I thought my approach was good because it kept my classes small and cohesive. My approach also helped to control the size of my test class.

The senior developer asserted that the scrubbing and merging classes would only be used by the XmlService class, and should therefore be part of it. He felt this would make the XMLService cohesive and this is what being cohesive means according to him. He also feels that breaking up classes this way makes them loose cohesiveness.

The irony is I tried to break these classes to achieve cohesiveness. What do you think? Who is right or wrong? Are we both right? Thank you for your suggestions.

I'm tempted to vote to close. While extremely interesting, this isn't nearly so much a question as an invitation for a discussion. I'd guess that marking it as CW is likely to reduce the chances of closure. — Jerry Coffin, Jul 21 '10 at 20:31

Humberto · Accepted Answer · 2010-07-21T20:48:16.773

13

If you follow the single responsibility principle (and based on the tone of your question, I think you do follow it), then the answer is clear:

A class is too big when it does more than one thing; and
A class is too small when it fails to fulfill its purpose.

That's very broad and indeed subjective -- hence the struggle with your colleague. On your side, you can argue:

There's absolutely no problem in creating additional classes -- It's a non-issue, compile-wise and runtime-wise.
The current implementation of the service may suggest that these classes "belong" to it, but that may change.
You can test each functionality separately.
You can apply dependency injection.
You ease the cognitive load of understanding the inner working of the service, because its code is smaller and better organized.

Furthermore, I think your boss has a misguided understanding of cohesion. Think of it as focus: the narrower the focus of your program, the higher the cohesion. If the code on your satellite classes is merged within the service class, the latter becomes less focused (less cohesive). It's generally accepted that higher cohesion is preferred over lower cohesion. Try to enlighten his/her view about it.

edited Jul 21 '10 at 20:48

answered Jul 21 '10 at 20:35

Humberto

7,117
4
31
46

1

There is a problem with creating additional classes. In some environments extra classes mean more tables which means more queries which can result in slower performance. In web MVC environments this is the most evident. Furthermore, if no other classes use these "satellite" classes it implies that the satellite classes actually are part of that singular responsibility that the original class should be handling. More code does not mean less cohesion. You should also consider that it should be easy for another developer to create a single object and call a function without worrying about phases. – Parris Jul 21 '10 at 23:00
@Parris, database access performance is another topic which doesn't seem to apply in the specific case of the OP. About the scope of the singular responsibility, consider the *reductio ad absurdum* explanation in the answer by @David Thornley. And about developer convenience, it depends entirely on the public interfaces provided by the service -- you can have both high cohesion and ease of instantiation. – Humberto Jul 22 '10 at 00:44
1

I understand the reductio ad absurdum principal and yes division of code is absolutely necessary. In this case however I don't think it makes any sense. First of all, speaking from an OOP perspective, a class should mimic some sort of real world object. What is a ParseXML, MergeXML or a XMLService class. Parse and merge sound like function names not like classes. I suppose Parse and Merge can follow the adapter pattern in some sense. The structure described does not seem elegant :/. That being said, perhaps its the best solution in an OOP only paradigm. – Parris Jul 22 '10 at 02:46
@Parris - the real world thing was an early part of the OOP world's thinking. A good OOP design contains many objects which have absolutely no relation to anything in the real world. They are elements of a program design that manipulates information, data, and devices, some of which may have a purpose far too arcane to say it mimics any real world thing. – Warren P Jul 22 '10 at 02:56
I think I've figured out a way where both me and the senior dev could come to an agreement. The answer is inner static classes. This way, I could still have three separate classes and keep them associated with the main class. In the future if there was a need to reuse one of those classes, that could be very simple. Also, the three classes could be testable independently. Thanks for all your answers, it really helped me understand this topic more. – Seagull Aug 22 '11 at 23:42

score 8 · Answer 2 · edited Jul 21 '10 at 20:35

8

Cohesion is a measure of how strongly related is the functionality within a body of code. That said, if merging and scrubbing aren't strongly related, including them both in the same class reduces cohesion.

The Single Responsibility Principle is also worth mentioning here. Creating a class for the sole purpose of scrubbing and another for the sole purpose of merging follows this principle.

I'd say your approach is the better of the two.

edited Jul 21 '10 at 20:35

devoured elysium

101,373
131
340
557

answered Jul 21 '10 at 20:31

tQuarella

365
4
13

1

The example provided does NOT follow this principal. 1 responsibility has been split into 3. Scrubbing and merging should be the responsibility of the original class that produces the XML file. They are simply utility operations, but can't even be classified as such since they have a such a specific purpose geared towards XML. In OOP a class must be recognizable to a non-programmer. As a non-programmer scrubXML makes no sense. http://en.wikipedia.org/wiki/Object-oriented_programming – Parris Jul 21 '10 at 22:42

score 2 · Answer 3 · answered Jul 21 '10 at 21:01

What would you name the classes? ScrubXml and MergeXml are nice names. ProcessXML and ScrubAndMergeXml aren't, the first being too general and the second having a conjunction. As long as none of the classes rely at all on the internals of one or the others (i.e., low coupling), you've got a good design there.

Names are very useful in determining cohesion. A cohesive module does one thing, and therefore has a simple specific name. A module that does more than one thing is less cohesive, and needs a more general or more complicated name.

One problem with merging functionality in X into Y if X is only used by Y is the reductio ad absurdam: if your call graph is acyclic, you'll wind up with all your functionality in one class.

score 2 · Answer 4 · answered Jul 22 '10 at 02:58

As someone who is coming back from the GodClass building fest of several years in duration, and now trying very hard to avoid that mistake in the future; the error of making a 6000 to 10000 line single source unit with a single class, with over 250 methods, 200 data fields, and some methods over 100 lines, the single responsibility principle looks like something worth defending against the predilections of your unenlightened supervisor.

If however, your supervisor and you are disagreeing over a matter of whether 2000 lines of code belong in one class or three, then I think you're both closer to sane, than I was. Maybe it's a matter of scale and perspective. Some object oriented programming aficionados like a certain "Coefficient of Standalone" per class, in direct violation of the generally understood ideas about how to improve cohesion and minimize coupling.

A set of 3 classes that work well together is, objectively, a more object-oriented system, than a single class that does the same thing. one could argue that if you write an application using only one class, that the application is not really object oriented at all.

score 1 · Answer 5 · answered Jul 21 '10 at 21:15

1

If the scrubber and merger are not meaningful outside the context of the main class, then I agree with your reviewer, particularly if you've had to expose any implementation details in order to allow this separation. If you're using a language supporting nested private classes or something similar, that might be a reasonable alternative to maintain the logical separation without exposing implementation details to outside consumers of the main class.

answered Jul 21 '10 at 21:15

Dan Bryant

27,329
4
56
102

oh I didn't mention this, but the scrubber and merger are not something which is exposed. – Seagull Jul 21 '10 at 22:06
1

Good point Dan. @Safder - see http://stackoverflow.com/questions/3300051/what-are-reasons-why-one-would-want-to-use-nested-classes for a discussion on how you can both be 'right'. – Sky Sanders Jul 21 '10 at 22:21
imagine version 2 of this class adds more functionality at the top of the class. Isn't having Scrubber and Merger as separate classes just going to help you avoid a refactor in the future when your main class does turn into a godclass? If you are over eager at class partitioning, shouldn't someone be a little light in criticizing this flaw? You're too organized and you think ahead too far. :-) – Warren P Jul 22 '10 at 03:04

score 0 · Answer 6 · answered Jul 21 '10 at 20:26

This is a very subjective area, and will be different depending on coding and style guidelines, and who approves your code.

That said, if your defense of your design didn't hold up, and your senior team member still insisted on merging your classes, then you have to compromise:

You've already got the logic separated out. Write the one service class and keep the methods separate like other good design, and then write a glue method. Add some comments above each method explaining how they could easily be partitioned to multiple classes if the need arises in the future.

ouch. wouldn't it hurt to have to do that? Seriously. Can't the supervisor state his point, and leave remedial action until his point actually bears some real fruit? Which it won't. — Warren P, Jul 22 '10 at 03:03

When is a class too big or too small?

6 Answers6