68

What Makes a Good Unit Test? says that a test should test only one thing. What is the benefit of that?

Wouldn't it be better to write somewhat bigger tests that exercise a larger block of code? Investigating a test failure is hard anyway, and I don't see how smaller tests help with that.

Edit: The word unit is not that important. Let's say I consider the unit to be a bit bigger. That is not the issue here. The real question is: why write one or more tests for every method, when a few tests that cover many methods are simpler?

An example: a list class. Why should I make separate tests for addition and removal? One test that first adds and then removes sounds simpler.
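
For illustration, the kind of combined test I have in mind would look roughly like this (a rough NUnit-style sketch; MyList is just a made-up list class):

[Test]
public void AddAndRemove()
{
    var list = new MyList();
    list.Add("a");      // exercise addition
    list.Remove("a");   // exercise removal
    Assert.AreEqual(0, list.Count);
}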

Alex B
  • 24,678
  • 14
  • 64
  • 87
iny
  • 7,339
  • 3
  • 31
  • 36
  • Well, you may not catch a bug in your code that happens only when you add and do not remove. – Dave DuPlantis Oct 24 '08 at 20:37
  • Because if it tests multiple things, it would be called a plethora test. – tchen Sep 01 '10 at 09:02
  • The answer to "Do you think unit tests are the bomb?" usually reduces to the question "How good are you at mocks and code architecture?". If you're not able to break your code down into individual units to test (mock out inputs and outputs, and only run the code you're testing), then unit tests simply won't fit. You'll find yourself writing the same setups/teardowns again and again and they'll take forever to run. – TheGrimmScientist Aug 07 '17 at 02:28

17 Answers

86

Testing only one thing will isolate that one thing and prove whether or not it works. That is the idea with unit testing. Nothing wrong with tests that test more than one thing, but that is generally referred to as integration testing. They both have merits, based on context.

To use an example: if your bedside lamp doesn't turn on, and you replace the bulb and swap the extension cord, you don't know which change fixed the issue. You should have done unit testing, and separated your concerns to isolate the problem.

Update: I read this article and the linked articles and I gotta say, I'm shook: https://techbeacon.com/app-dev-testing/no-1-unit-testing-best-practice-stop-doing-it

There is substance here and it gets the mental juices flowing. But I reckon it jibes with the original sentiment that we should be doing the testing that the context demands. I suppose I'd just append that we need to get closer to knowing for sure the benefits of different kinds of testing on a system, and rely less on a cross-your-fingers approach. Measurements/quantifications and all that good stuff.

MrBoJangles
  • 12,127
  • 17
  • 61
  • 79
  • Why does it matter to know everything at once? I can fix a failure and then run the test again to get the next one. – iny Oct 24 '08 at 19:55
  • 2
    "Unit" testing, by definition tests a unit of your program (i.e. one piece) at a time. – wprl Oct 24 '08 at 19:57
  • 1
    Absolutely, you can do it that way if it works for you. I'm not easily given to methodologies. I just do what works in the context. – MrBoJangles Oct 24 '08 at 20:05
  • @iny - Sure but if it takes 30 minutes to execute the test run then you may want a more thorough test report and fix a bunch at the same time – Newtopian Oct 24 '08 at 20:22
  • @Newtopian - Running only the failed test is quite simple. – iny Oct 24 '08 at 20:24
  • When you write narrowly-focused unit tests, it can be quicker to find and fix defects in the code. It is also easier to measure your progress; if you are testing 50 features, it is more meaningful to show that 30 tests in 50 pass than to show that 1 test in 5 passes. – Dave DuPlantis Oct 24 '08 at 20:36
  • As a rule of thumb, that sounds pretty reasonable. But that assertion, like most, can legitimately be answered with "it depends". – MrBoJangles Oct 24 '08 at 21:35
  • I think I'll be pedantic and point out a unit is a collection or container of 'things'. Move as a single unit, unit of work and so on. – Chris S Feb 27 '09 at 12:57
  • I agree with this answer. Interestingly enough if you write smaller tests that test only one thing you end up gravitating towards a design that is easier to test single things... which tends to require less setup state, which ends up being much faster than larger integration tests and providing much more confidence. I think a realistic goal should be about 80,000 tests running in under 10 minutes. Which is to say the vast majority of your tests should run in under 10ms each with fewer (but important) integration tests mixed in as well. None running longer than 10 seconds. Ration tests by time. – justin.m.chase Oct 17 '12 at 22:38
78

I'm going to go out on a limb here, and say that the "only test one thing" advice isn't actually as helpful as it's sometimes made out to be.

Sometimes tests take a certain amount of setting up. Sometimes they may even take a certain amount of time to set up (in the real world). Often you can test two actions in one go.

Pro: all that setup occurs only once. Your assertions after the first action prove that the world is how you expect it to be before the second action. Less code, faster test run.

Con: if either action fails, you'll get the same result: the same test will fail. You'll have less information about where the problem is than if you only had a single action in each of two tests.

In reality, I find that the "con" here isn't much of a problem. The stack trace often narrows things down very quickly, and I'm going to make sure I fix the code anyway.

A slightly different "con" here is that it breaks the "write a new test, make it pass, refactor" cycle. I view that as an ideal cycle, but one which doesn't always mirror reality. Sometimes it's simply more pragmatic to add an extra action and check (or possibly just another check to an existing action) in a current test than to create a new one.
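
To make that concrete, here's a sketch of the kind of test I mean (NUnit-style; the Account class and the setup helper are made up purely for illustration):

[Test]
public void DepositThenWithdraw()
{
    Account account = CreateAccountWithExpensiveSetup();  // imagine this is slow or elaborate

    account.Deposit(100m);
    Assert.AreEqual(100m, account.Balance);  // proves the world is as expected...

    account.Withdraw(30m);                   // ...before the second action
    Assert.AreEqual(70m, account.Balance);
}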

user247702
  • 23,641
  • 15
  • 110
  • 157
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • 7
    As ever Jon, you might be out on a limb, but you are talking sense from that branch you chose as your perch. – David Arno Oct 24 '08 at 20:05
  • 1
    I do agree with your point: while a best practice may be to test only one feature per test, your environment may dictate that you test multiple features. – Dave DuPlantis Oct 24 '08 at 20:39
  • 2
    Words mean something: a unit test should test one unit of the program. One method, one function. Integration and functional tests (which can be automated!) test bigger blocks. I also downvoted as the questioner seemed to already have the answer in mind, and ignored the answer with more upvotes. – Terry G Lorber Feb 04 '09 at 22:10
  • 6
    @Terry: That sounds lovely in theory, but in my view that doesn't end up working 100% of the time in practice. If, in *some* cases, you end up with simpler, smaller code by testing two actions in a single test case, where's the *practical* benefit in not doing so? – Jon Skeet Feb 04 '09 at 22:42
  • 1
    @Jon: In practice, I've found it easier to test small chunks, YMMV. Nothing works 100%, so, choose wisely. I'd add this as a con to not doing proper unit tests: The demands of writing unit testable code can benefit the design of the software (abstraction, encapsulation, short methods, etc.) – Terry G Lorber Feb 10 '09 at 03:48
  • @Terry: I like to test small chunks *where I can* - but I'd rather test large chunks than not test at all, and I'd rather have large chunks with code which I'm confident in than no code at all because I can't get one particular bit into small tests. Aim high, but be practical. – Jon Skeet Feb 10 '09 at 06:30
  • @Jon: The list example in the Q. describes many assertions within one test method. Which still sounds like a unit test to me, nothing wrong with setup reuse. I agree, better a large test, e.g., CPPUNIT_ASSERT( true == irs.paidTaxesLastFiveYears()), than no test. Semantics of assertion vs. test? – Terry G Lorber Feb 10 '09 at 13:27
  • 1
    I wouldn't like to talk too much about the semantics of an assertion vs a test. I'd certainly rather have a test with many assertions than a test which did many different things... but sometimes I find the latter is unavoidable :( – Jon Skeet Feb 10 '09 at 13:40
  • One problem with testing in large chunks is that then the tests provide less documentation value: a short name on a large test can not possibly describe the writer's full intent, which in turn requires the reader to reverse-engineer the writer's intent from the test code and implementation. See my answer and the article which is linked there. – Esko Luontola Feb 27 '10 at 23:35
  • @Esko: I'm not suggesting *enormous* tests - just not "one assertion per test". There's a happy medium to be found. – Jon Skeet Feb 28 '10 at 07:14
  • It's good that you're not suggesting enormous tests, but showing some code would make it more clear what you mean - one person's "small" can be someone else's "big". The "one assertion per test" can be interpreted many ways. The rigid literal interpretation "one assert statement per test" is not always practical, because individual assert statements are very low level and sometimes you can't check a value with just one assert, but the higher-level interpretation "one concept per test" is more important. See Clean Code pages 130-132. – Esko Luontola Feb 28 '10 at 07:50
  • @Esko: The kind of code where this really makes a difference is business code with tricky setup etc. Samples such as collections end up favouring very small tests by their very nature - only *real* code really matters, and that's usually confidential. I still believe there's a lot of room for a range of sizes of test - some very low level, and some relatively high level (even if they're still short of being full acceptance tests). – Jon Skeet Feb 28 '10 at 07:58
  • Well.. this goes to the same issue of each test should have only one assert. I believe the issue is that each test is supposed to be atomic. If you have more than one assert, you don't have an atomic test anymore, since you only test anything up to the first failing assert. – txwikinger Jul 02 '10 at 20:08
  • @txwikinger: That just begs the question of why you believe each test is supposed to be atomic. I believe wholeheartedly that testing should be pragmatic. If using more than one assertion (for related things, of course) in a test makes that test easier to read and quicker to write than separating them out, I don't see much harm. Usually if the first assertion fails, that indicates a problem which would probably cause the second assertion to fail anyway. I can't believe I've lost significant amounts of time with this approach - rather I believe I've saved time and got more understandable tests. – Jon Skeet Jul 02 '10 at 20:26
  • @Jon Skeet: Well pragmatism means to know when rules apply and when not, not that you don't have sensible rules. Secondly, I would call your philosophy of testing rather integration testing than unit testing, which is important too and should be capable of being done in a testing framework. Unit testing should ideally have decoupled tests, which is not the case if you assume that the second assertion will fail in any case if the first does. – txwikinger Jul 02 '10 at 20:53
  • @txwikinger: No, integration testing means something very different to me. It means testing the integration of multiple components. I'm not talking about that at all. I'm talking about asserting multiple things about either the results of one operation, or about running through one *logical* operation which may involve a few calls and assertions along the way. – Jon Skeet Jul 02 '10 at 20:57
  • @Jon Skeet: Maybe that is one of those exceptions to the rule. However, maybe your code needs refactoring due to the fact that too many things are part of one logical operation. It is always difficult to discuss such issues on a pure abstract level. IMHO it make sense to try to have unit tests designed not needing such complex buildups, since it induces the risk that such tests are more difficult to maintain when the project iterates and things are changing, in particular are refactored. As stated before, guidelines don't fit every situation and different aspects of quality often clash. – txwikinger Jul 02 '10 at 21:18
  • I have been burned by this pretty badly in the past. The thing you have to watch out for is when you share setup state between tests because it is "slow" to recreate it each time (red flag) and one of the tests does not clean up after itself properly. It is easy to regress this and what you end up with is randomly failing tests, since when run in isolation or in a different order the tests pass it can be a real nightmare to locate the bug. Test only 1 thing doesn't mean only 1 assert though. Use your discretion and stick to AAA, if you're acting after asserting then that is a red flag. – justin.m.chase Oct 17 '12 at 22:31
  • In `pytest` you can run setup only once for a group of tests. I think this is just a question of technology, not an excuse to make tests ugly by mixing different logic. – uhbif19 Apr 26 '19 at 10:14
  • Another con of testing multiple things is that your tests are less readable. Your tests should prove that your functionality works. When I'm reviewing your code, I want to see you prove your feature. If you're testing multiple things, the tests will be difficult to follow and it's harder to see if you've covered everything. This is very important also for the author when you do TDD and you're exploring your use cases. – Simon Fontana Oscarsson May 04 '23 at 20:01
  • @SimonFontanaOscarsson: Whereas if I'm writing a test for a piece of code that happens to have two outputs, I personally find it easier to read a test for both of those outputs at the same time - at least in many cases. Obviously it makes sense to write the tests as readably as possible - I'm just saying that if you *dogmatically* decide that must *always* mean testing a single thing (which I've known people interpret as *exactly one assertion per test*) then you're removing flexibility from deciding what the most readable way of expressing tests is for any given situation. – Jon Skeet May 04 '23 at 20:20
  • @SimonFontanaOscarsson: Basically, I don't believe that dogma should replace thinking and judgement... and unfortunately I've seen that happen far too often :( – Jon Skeet May 04 '23 at 20:22
14

Tests that check for more than one thing aren't usually recommended because they are more tightly coupled and brittle. If you change something in the code, it'll take longer to change the test, since there are more things to account for.

[Edit:] Ok, say this is a sample test method:

[TestMethod]
public void TestSomething() {
  // Test condition A
  // Test condition B
  // Test condition C
  // Test condition D
}

If your test for condition A fails, then B, C, and D will appear to fail as well, and won't provide you with any useful information. What if your code change had caused C to fail as well? If you had split them out into 4 separate tests, you would know this.
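
Split out, the same placeholder checks would look roughly like this:

[TestMethod]
public void TestConditionA() { /* Test condition A */ }

[TestMethod]
public void TestConditionB() { /* Test condition B */ }

[TestMethod]
public void TestConditionC() { /* Test condition C */ }

[TestMethod]
public void TestConditionD() { /* Test condition D */ }

Now a change that breaks A and C shows up as exactly two failing tests instead of one opaque failure.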

swilliams
  • 48,060
  • 27
  • 100
  • 130
  • 1
    But writing smaller tests takes longer too, as one has to write more code to set them up. You can't delete without creating something. Why not do the create and then the delete in the same test? – iny Oct 24 '08 at 19:59
  • I'm confused, what exactly is "created" and "deleted" here? It's just been my experience that when I have long, monolithic tests, I spend more time debugging _them_ than the code they test. – swilliams Oct 24 '08 at 20:04
  • This is a good discussion though, and I like that you are defending your opinion, even if I think you are wrong :) – swilliams Oct 24 '08 at 20:04
  • See the addition in the question. – iny Oct 24 '08 at 20:26
  • Actually, I'd argue quite the opposite. In the case where these conditions are serially dependent, if your test for condition A fails, you get one failure: Condition A (and the rest don't run). If you had them all independently you'd have them all fail when their setup fails. – Matthew D. Scholefield Aug 29 '21 at 04:56
11

Haaa... unit tests.

Push any "directives" too far and it rapidly becomes unusable.

A single unit test testing a single thing is just as good a practice as a single method doing a single task. But IMHO that does not mean a single test can only contain a single assert statement.

Whether

@Test
public void checkNullInputFirstArgument(){...}
@Test
public void checkNullInputSecondArgument(){...}
@Test
public void checkOverInputFirstArgument(){...}
...

is better than

@Test
public void testLimitConditions(){...}

is a question of taste, in my opinion, rather than of good practice. I personally much prefer the latter.

But

@Test
public void doesWork(){...}

is actually what the "directive" wants you to avoid at all cost and what drains my sanity the fastest.

As a final conclusion: group together things that are semantically related and easily testable together, so that a failed test message, by itself, is meaningful enough for you to go directly to the code.

Rule of thumb here on a failed test report: if you have to read the test's code first, then your tests are not structured well enough and need to be split into smaller tests.

My 2 cents.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Newtopian
  • 7,543
  • 4
  • 48
  • 71
  • If the test framework can pinpoint the location of failure in a test with multiple assertions, that goes a long way in easing the stricture of unit testing. I really can go either way here as far as your examples above are concerned. – MrBoJangles Oct 24 '08 at 20:12
  • "Single unit test test a single thing is just as good practice as single method does a single task." Funny you say that. Because you need to have very clean functions / code to make good testing possible. – TheGrimmScientist Aug 07 '17 at 02:24
8

Think of building a car. If you were to apply your theory of just testing big things, then why not make a single test that drives the car through a desert? It breaks down. OK, so tell me what caused the problem. You can't. That's a scenario test.

A functional test may be to turn on the engine. It fails. But that could be because of a number of reasons. You still couldn't tell me exactly what caused the problem. We're getting closer though.

A unit test is more specific and will, first of all, identify exactly where the code is broken; but it will also (if you're doing proper TDD) help architect your code into clear, modular chunks.

Someone mentioned using the stack trace. Forget it; that's a second resort. Going through the stack trace or using the debugger is a pain and can be time consuming, especially on larger systems with complex bugs.

Good characteristics of a unit test:

  • Fast (milliseconds)
  • Independent. It's not affected by or dependent on other tests
  • Clear. It shouldn't be bloated, or contain a huge amount of setup.
Bealer
  • 864
  • 9
  • 16
6

Using test-driven development, you would write your tests first, then write the code to pass the test. If your tests are focused, this makes writing the code to pass the test easier.

For example, I might have a method that takes a parameter. One of the things I might think of first is: what should happen if the parameter is null? It should throw an ArgumentNullException (I think). So I write a test that checks whether that exception is thrown when I pass a null argument. Run the test. Okay, it throws NotImplementedException. I go and fix that by changing the code to throw an ArgumentNullException. Run my test; it passes. Then I think, what happens if the parameter is too small or too big? Ah, that's two tests. I write the too-small case first.
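
For example, that first test might look something like this (a sketch; Parser and Parse are made-up names, not from any real project):

[Test]
public void Parse_NullArgument_ThrowsArgumentNullException()
{
    var parser = new Parser();
    Assert.Throws<ArgumentNullException>(() => parser.Parse(null));
}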

The point is that I don't think of the behavior of the method all at once. I build it up incrementally (and logically) by thinking about what it should do, then implementing code and refactoring as I go to make it look pretty (elegant). This is why tests should be small and focused: when you are thinking about the behavior, you should develop in small, understandable increments.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
tvanfosson
  • 524,688
  • 99
  • 697
  • 795
  • This is a great answer. Unit tests aid test-driven development. That's an excellent argument for unit tests. – MrBoJangles Oct 24 '08 at 20:14
  • I hadn't really thought about, but yes. Testing only one thing (or small things) does make TDD possible. If your tests were large, TDD would be an abysmal way to write software. – tvanfosson Oct 24 '08 at 20:36
4

Having tests that verify only one thing makes troubleshooting easier. It's not to say you shouldn't also have tests that do test multiple things, or multiple tests that share the same setup/teardown.

Here is an illustrative example. Let's say that you have a stack class with queries:

  • getSize
  • isEmpty
  • getTop

and methods to mutate the stack

  • push(anObject)
  • pop()

Now, consider the following test case for it (I'm using Python-like pseudo-code for this example):

class TestCase():
    def setup(self):
        self.stack = Stack()

    def test(self):
        self.stack.push(1)
        self.stack.push(2)
        self.stack.pop()
        assert self.stack.top() == 1, "top() isn't showing correct object"
        assert self.stack.getSize() == 1, "getSize() call failed"

From this test case, you can determine if something is wrong, but not whether it is isolated to the push() or pop() implementations, or the queries that return values: top() and getSize().

If we add individual test cases for each method and its behavior, things become much easier to diagnose. Also, by doing fresh setup for each test case, we can guarantee that the problem is completely within the methods that the failing test method called.

def test_size(self):
    assert self.stack.getSize() == 0
    assert self.stack.isEmpty()

def test_push(self):
    self.stack.push(1)
    assert self.stack.top() == 1, "top returns wrong object after push"
    assert self.stack.getSize() == 1, "getSize wrong after push"

def test_pop(self):
    self.stack.push(1)
    self.stack.pop()
    assert self.stack.getSize() == 0, "getSize wrong after pop"

As far as test-driven development is concerned, I personally write larger "functional tests" that end up testing multiple methods at first, and then create unit tests as I start to implement individual pieces.

Another way to look at it is that unit tests verify the contract of each individual method, while larger tests verify the contract that the objects and the system as a whole must follow.

I'm still using three method calls in test_push; however, both top() and getSize() are queries that are tested by separate test methods.

You could get similar functionality by adding more asserts to the single test, but then later assertion failures would be hidden.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Ryan
  • 15,016
  • 6
  • 48
  • 50
  • First, looks to me like you are testing three methods in test_push, not one, and you still have to look at what assert failed to figure out what is wrong. And these two tests don't test as much behavior as the original combined test. So why not the combined test with a more asserts? – Sol Feb 04 '09 at 13:37
  • See post for extended explanation. – Ryan Feb 04 '09 at 21:35
4

If you are testing more than one thing, then it is called an integration test, not a unit test. You would still run these integration tests in the same testing framework as your unit tests.

Integration tests are generally slower; unit tests are fast because all dependencies are mocked/faked, so there are no database, web service, or other slow calls.

We run our unit tests on commit to source control, and our integration tests only get run in the nightly build.

Owen Davies
  • 149
  • 1
  • 11
3

If you test more than one thing and the first thing you test fails, you will not know if the subsequent things you are testing pass or fail. It is easier to fix when you know everything that will fail.

Rob Prouse
  • 22,161
  • 4
  • 69
  • 89
3

The glib, but hopefully still useful, answer is that unit = one. If you test more than one thing, then you aren't unit testing.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
David Arno
  • 42,717
  • 16
  • 86
  • 131
3

Smaller unit tests make it clearer where the issue is when they fail.

None
  • 2,927
  • 3
  • 29
  • 42
2

Regarding your example: if you are testing add and remove in the same unit test, how do you verify that the item was ever added to your list? That is why you need to add, and verify that it was added, in one test.

Or to use the lamp example: If you want to test your lamp and all you do is turn the switch on and then off, how do you know the lamp ever turned on? You must take the step in between to look at the lamp and verify that it is on. Then you can turn it off and verify that it turned off.
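
In code, the "look at the lamp" step is simply an extra assertion between the two actions (a sketch, again with a made-up list class):

[Test]
public void AddThenRemove()
{
    var list = new MyList();

    list.Add("a");
    Assert.AreEqual(1, list.Count);   // look at the lamp: is it actually on?

    list.Remove("a");
    Assert.AreEqual(0, list.Count);   // and now: is it off again?
}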

tdahlke
  • 287
  • 1
  • 3
  • 11
2

I support the idea that unit tests should only test one thing. I also stray from it quite a bit. Today I had a test where expensive setup seemed to be forcing me to make more than one assertion per test.

namespace Tests.Integration
{
  [TestFixture]
  public class FeeMessageTest
  {
    [Test]
    public void ShouldHaveCorrectValues()
    {
      var fees = CallSlowRunningFeeService();
      Assert.AreEqual(6.50m, fees.ConvenienceFee);
      Assert.AreEqual(2.95m, fees.CreditCardFee);
      Assert.AreEqual(59.95m, fees.ChangeFee);
    }
  }
}

At the same time, I really wanted to see all my assertions that failed, not just the first one. I was expecting them all to fail, and I needed to know what amounts I was really getting back. But, a standard [SetUp] with each test divided would cause 3 calls to the slow service. Suddenly I remembered an article suggesting that using "unconventional" test constructs is where half the benefit of unit testing is hidden. (I think it was a Jeremy Miller post, but can't find it now.) Suddenly [TestFixtureSetUp] popped to mind, and I realized I could make a single service call but still have separate, expressive test methods.

namespace Tests.Integration
{
  [TestFixture]
  public class FeeMessageTest
  {
    Fees fees;
    [TestFixtureSetUp]
    public void FetchFeesMessageFromService()
    {
      fees = CallSlowRunningFeeService();
    }

    [Test]
    public void ShouldHaveCorrectConvenienceFee()
    {
      Assert.AreEqual(6.50m, fees.ConvenienceFee);
    }

    [Test]
    public void ShouldHaveCorrectCreditCardFee()
    {
      Assert.AreEqual(2.95m, fees.CreditCardFee);
    }

    [Test]
    public void ShouldHaveCorrectChangeFee()
    {
      Assert.AreEqual(59.95m, fees.ChangeFee);
    }
  }
}

There is more code in this test, but it provides much more value by showing me all the values that don't match expectations at once.

A colleague also pointed out that this is a bit like Scott Bellware's specunit.net: http://code.google.com/p/specunit-net/

Dave Cameron
  • 2,110
  • 2
  • 19
  • 23
1

The real question is: why write one or more tests for every method, when a few tests that cover many methods are simpler?

Well, so that when a test fails, you know which method is failing.

When you have to repair a non-functioning car, it is easier when you know which part of the engine is failing.

An example: a list class. Why should I make separate tests for addition and removal? One test that first adds and then removes sounds simpler.

Let's suppose that the addition method is broken and does not add, and that the removal method is broken and does not remove. Your test would check that the list, after addition and removal, has the same size as initially. Your test would pass, even though both of your methods are broken.
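
In test code, that degenerate case looks like this (a sketch; MyList is a made-up list class whose Add and Remove are imagined to silently do nothing):

[Test]
public void AddThenRemove_StillPasses()
{
    var list = new MyList();          // suppose Add() and Remove() both silently do nothing
    int initialSize = list.Count;

    list.Add("x");
    list.Remove("x");

    Assert.AreEqual(initialSize, list.Count);  // passes anyway: the size never changed
}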

Nicolas Barbulesco
  • 1,789
  • 3
  • 15
  • 20
1

When a test fails, there are three options:

  1. The implementation is broken and should be fixed.
  2. The test is broken and should be fixed.
  3. The test is no longer needed and should be removed.

Fine-grained tests with descriptive names help the reader to know why the test was written, which in turn makes it easier to know which of the above options to choose. The name of the test should describe the behaviour which is being specified by the test - and only one behaviour per test - so that just by reading the names of the tests the reader will know what the system does. See this article for more information.

On the other hand, if one test is doing lots of different things and it has a non-descriptive name (such as tests named after methods in the implementation), then it will be very hard to find out the motivation behind the test, and it will be hard to know when and how to change the test.

Here is what it can look like (with GoSpec), when each test tests only one thing:

func StackSpec(c gospec.Context) {
  stack := NewStack()

  c.Specify("An empty stack", func() {

    c.Specify("is empty", func() {
      c.Then(stack).Should.Be(stack.Empty())
    })
    c.Specify("After a push, the stack is no longer empty", func() {
      stack.Push("foo")
      c.Then(stack).ShouldNot.Be(stack.Empty())
    })
  })

  c.Specify("When objects have been pushed onto a stack", func() {
    stack.Push("one")
    stack.Push("two")

    c.Specify("the object pushed last is popped first", func() {
      x := stack.Pop()
      c.Then(x).Should.Equal("two")
    })
    c.Specify("the object pushed first is popped last", func() {
      stack.Pop()
      x := stack.Pop()
      c.Then(x).Should.Equal("one")
    })
    c.Specify("After popping all objects, the stack is empty", func() {
      stack.Pop()
      stack.Pop()
      c.Then(stack).Should.Be(stack.Empty())
    })
  })
}
Esko Luontola
  • 73,184
  • 17
  • 117
  • 128
  • The difference here is that you effectively have nested tests. The three tests about "pushed last is popped first", "pushed first is popped last" and "afterwards the stack is empty" are effectively subtests. That's quite a neat way of doing things, but not one supported by (say) JUnit and NUnit. (I don't particularly like the "let's make it all read like English", but that's a different matter.) How would you express these tests in JUnit? As 5 separate tests, or 2? (Each of the two would contain multiple assertions - optionally with messages.) – Jon Skeet Feb 28 '10 at 08:01
  • In JUnit 4 I would use a simple custom runner, so that I can use inner classes like this: http://github.com/orfjackal/tdd-tetris-tutorial/blob/beyond/src/test/java/tetris/FallingBlocksTest.java In JUnit 3 it doesn't work as nicely, but it's possible like this: http://github.com/orfjackal/tdd-tetris-tutorial/blob/8dcba9521ac8e475566619da0883abafcd8f8b14/src/test/java/tetris/FallingBlocksTest.java In a framework which does not have fixtures (such as gotest), I would grudgingly write all the same information in the name of one method. Not having fixtures produces lots of duplication. – Esko Luontola Feb 28 '10 at 16:51
  • I haven't used NUnit nor C#, but from http://www.nunit.org/index.php?p=quickStart&r=2.5.3 it appears that NUnit would natively support this style of organizing tests. Just put multiple test fixtures into the same namespace, so that in one file/namespace there are all test fixtures which relate to the same behaviour. – Esko Luontola Feb 28 '10 at 17:02
  • The best is of course, if the testing framework already supports the preferred style of writing tests. In Java I've mostly used JDave, in Scala Specs, in Ruby RSpec etc. And if nothing suitable exists, implementing one yourself can be done in a week. This was the case with Go: the only framework was gotest but it was too restricted, gospecify was under development but its author had different project goals (no isolation of side-effects), so I created GoSpec 1.0 in less than 50 hours. – Esko Luontola Feb 28 '10 at 17:28
1

Another practical disadvantage of very granular unit testing is that it breaks the DRY principle. I have worked on projects where the rule was that each public method of a class had to have a unit test (a [TestMethod]). Obviously this added some overhead every time you created a public method, but the real problem was that it added some "friction" to refactoring.

It's similar to method-level documentation: it's nice to have, but it's another thing that has to be maintained, and it makes changing a method signature or name a little more cumbersome and slows down "floss refactoring" (as described in "Refactoring Tools: Fitness for Purpose" by Emerson Murphy-Hill and Andrew P. Black. PDF, 1.3 MB).

Like most things in design, there is a trade-off that the phrase "a test should test only one thing" doesn't capture.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Maurice Flanagan
  • 5,179
  • 3
  • 30
  • 37
0

Disclaimer: This is an answer highly influenced by the book "xUnit Test Patterns".

Testing only one thing in each test is one of the most basic principles, and it provides the following benefits:

  • Defect Localization: If a test fails, you immediately know why it failed (ideally without further troubleshooting, if you've done a good job with the assertions used).
  • Test as a specification: the tests are not only there as a safety net, but can easily be used as specification/documentation. For instance, a developer should be able to read the unit tests of a single component and understand the API/contract of it, without needing to read the implementation (leveraging the benefit of encapsulation).
  • Feasibility of TDD: TDD is based on having small chunks of functionality and completing progressive iterations of (write a failing test, write code, verify the test succeeds). This process gets highly disrupted if a test has to verify multiple things.
  • Lack of side effects: Somewhat related to the first one. When a test verifies multiple things, it's more likely to be tied to other tests as well. These tests might then need a shared test fixture, which means that one will be affected by the other. Eventually you might have a test failing when, in reality, another test is the one that caused the failure, e.g. by changing the fixture data.

I can see only a single reason why you might benefit from having a test that verifies multiple things, but this should actually be seen as a code smell:

  • Performance optimisation: There are some cases where your tests are not running only in memory, but are also dependent on persistent storage (e.g. databases). In some of these cases, having a test verify multiple things might help in decreasing the number of disk accesses, thus decreasing the execution time. However, unit tests should ideally be executable only in memory, so if you stumble upon such a case, you should reconsider whether you are going down the wrong path. All persistent dependencies should be replaced with mock objects in unit tests. End-to-end functionality should be covered by a different suite of integration tests. In this way, you do not need to care about execution time anymore, since integration tests are usually executed by build pipelines and not by developers, so a slightly higher execution time has almost no impact on the efficiency of the software development lifecycle.
Dimos
  • 8,330
  • 1
  • 38
  • 37
  • A test that tests more than one thing in most cases has less code than separate tests. Testing two tightly related things together makes sure that the two things actually work together. – iny Mar 24 '17 at 07:49
  • I think though that what you are referring to slightly escapes the context of unit testing and goes towards component-level testing. When unit testing, you ideally want to test each piece of functionality completely isolated. When doing component testing, you might indeed need to test 2 different pieces of functionality together, if they provide a bigger set of functionality to a higher level in the design hierarchy. – Dimos Mar 24 '17 at 12:53