White-Box Testing is Fragile

White-box tests depend on the internals of your code instead of just its behavior. In this article I explain why this kind of testing can lead to fragile tests (tests that break even when no bug was introduced).

Black-Box vs White-Box Testing

Depending on how much the tester knows about the inner workings of the code, tests fall into two categories:

  • Black-box testing: you have no information on how the system is implemented. In a unit test, for example, you only know the signatures of the public methods. You test only the behavior of the system: you provide input and check the output.

  • White-box testing: you know how the system works internally. You may also have access to private members of classes (manipulating the objects in ways you couldn’t in production code). For example, you may have access to a private field that represents the internal state of an object. Instead of testing the external behavior of the class by calling its methods, you read this field directly to check that the object is in the state you expect. If your testing framework does not provide access to private members, you may create a method that exposes this field (even if only your test code uses it).
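
To make the distinction concrete, here is a minimal sketch in Java. The Cart class, its items field, and the CartChecks helper are hypothetical names invented for this illustration; they are not part of the example discussed in this article. The first check uses only public methods, while the second reads a private field through reflection.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, used only for illustration.
class Cart {
  private final List<String> items = new ArrayList<>(); // internal state

  void add(String item) {
    items.add(item);
  }

  int size() {
    return items.size();
  }
}

class CartChecks {

  // Black-box check: relies only on the public behavior of Cart.
  static boolean blackBoxCheck() {
    Cart cart = new Cart();
    cart.add("book");
    return cart.size() == 1;
  }

  // White-box check: reaches into the private field through reflection.
  static boolean whiteBoxCheck() {
    Cart cart = new Cart();
    cart.add("book");
    try {
      Field field = Cart.class.getDeclaredField("items");
      field.setAccessible(true);
      List<?> items = (List<?>) field.get(cart);
      return items.size() == 1;
    } catch (ReflectiveOperationException e) {
      // A renamed or removed field ends up here, even if Cart has no bug.
      throw new IllegalStateException("Internal layout of Cart changed", e);
    }
  }
}
```

Both checks pass today. If the items field were renamed, or replaced by another data structure, size() could keep returning the same value and blackBoxCheck would still pass, while whiteBoxCheck would throw.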

I can’t remember ever needing white-box testing, although I acknowledge there may be very specific uses for it. I agree with Kent Beck (in his Test-Driven Development book):

Wishing for white box testing is not a testing problem, it is a design problem.

When you are stuck trying to test something and you think you will need white-box testing, I suggest the following procedure (pausing after each step to try black-box testing again or to improve your design):

  • Drink a glass of water or a cup of coffee (consider including small talk with a coworker)
  • Read one or two blog posts (like this one)
  • Discuss with a colleague the issues you’re having regarding the tests

If it does not work, just go on with the white-box testing. Maybe you’ll feel miserable, but one day you may think of a better solution (and will happily refactor your code).

Just don’t mistake “I didn’t find a solution for black-box testing” for “White-box testing was necessary”.

Unnecessary White-Box Tests

It may be the case that you created a white-box test because you couldn’t find a better solution today. You really strove. You did your best. In the end, you just decided to move on.

Spending no effort trying to have black-box tests is a completely different situation. You will expose the internals of your objects and will most likely end up with fragile tests: the ones that fail easily when you add new functionality or refactor code, even if the system continues to work as expected.

I’ll show you an example of how a white-box test can be fragile.

Example

Consider the example in the picture below (code available on GitLab):

Initial Class Diagram - Passing Tests

We have the classes JCharacterImpl and JDigit. The first holds any character; the second is a specialization that holds digits from 0 to 9. Both implement the interface JCharacter.

JCharacterImpl and JDigit are created by a factory:


  public JCharacter create(char character) {
  
    JCharacter result;
  
    if (java.lang.Character.toString(character).matches("[0-9]")) {
      result = new JDigit(character);
    } else {
      result = new JCharacterImpl(character);
    }

    return result;

  }
  

We also have the classes JCharacterSequence and JDigitSequence, which hold a sequence of the previous classes. JDigitSequence just ignores anything that is not a digit, as you can see in the tests below (using JUnit 5):


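  import org.junit.jupiter.api.Assertions;
  import org.junit.jupiter.api.BeforeAll;
  import org.junit.jupiter.api.Test;

  public class DigitSequenceTest {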

    private static JCharacterSequence digitSequence;

    @BeforeAll
    static void setupClass() {
      JCharacterFactory factory = JCharacterFactory.getInstance();
      digitSequence = new JDigitSequence();
      digitSequence.add(factory.create('a'));
      digitSequence.add(factory.create('1'));
      digitSequence.add(factory.create('b'));
    }

    @Test
    void testWhiteBox() {
      for (JCharacter character : digitSequence.getCharacterSequence()) {
        Assertions.assertTrue(character instanceof JDigit);
      }
    }

    @Test
    void testBlackBox() {
      Assertions.assertEquals("1", digitSequence.getAsString());
    }

  }
  

The tests aren’t equivalent, but all we want to check is that the sequence contains only digits.

Notice how the white-box test checks whether the objects in the sequence are of type JDigit. This test clearly depends on implementation details. Also, the method getCharacterSequence exists only because of the test. Because the developer knows how the code was implemented, they thought it would be simpler to test that way.

The black-box test, on the other hand, checks the expected behavior instead of checking the internals of the object.
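
A behavioral check like this also extends naturally to edge cases without any accessor to the internals. The sketch below assumes a simplified stand-in, SimpleDigitSequence, that takes raw chars instead of JCharacter objects; it and the digitsOf helper are hypothetical and may differ from the classes in the repository.

```java
// Hypothetical stand-in for JDigitSequence; the real class may be implemented differently.
class SimpleDigitSequence {
  private final StringBuilder digits = new StringBuilder();

  // Keeps the character only if it is a digit from 0 to 9; anything else is ignored.
  void add(char character) {
    if (character >= '0' && character <= '9') {
      digits.append(character);
    }
  }

  String getAsString() {
    return digits.toString();
  }
}

class SimpleDigitSequenceChecks {

  // Feeds the input into a new sequence and returns what an outside caller can observe.
  static String digitsOf(String input) {
    SimpleDigitSequence sequence = new SimpleDigitSequence();
    for (char character : input.toCharArray()) {
      sequence.add(character);
    }
    return sequence.getAsString();
  }
}
```

Cases such as an empty input, an input with no digits, or several digits in a row can then be checked through digitsOf alone. How the sequence stores its content is free to change.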

This code is in the master branch of the GitLab repository.

We have both these tests for the sake of the example. Suppose the developers discussed which testing strategy is best. The team was divided because both seemed to work.

Passing Tests

Until…

The White-Box Test Breaks

The JCharacterWrapper class was created, encapsulating the old classes and delegating the current behavior to them, as in the diagram below:

Final Class Diagram - Failing Tests

Notice there wasn’t any modification in behavior. Everything still implements the same interface as before. The only modification is that, instead of returning an instance of JCharacterImpl or JDigit, the factory now returns an instance of JCharacterWrapper:


  public JCharacter create(char character) {
  
    JCharacter result;

    if (java.lang.Character.toString(character).matches("[0-9]")) {
      result = new JDigit(character);
    } else {
      result = new JCharacterImpl(character);
    }

    return new JCharacterWrapper(result);
  
  }
  

This code is in the failing-test branch of the GitLab repository.

Even though the code still works as expected, we now have a failing test.

Failing Tests

Why? Instead of testing the code’s behavior, we tested its structure.


  @Test
  void testWhiteBox() {
    for (JCharacter character : digitSequence.getCharacterSequence()) {
      Assertions.assertTrue(character instanceof JDigit);
    }
  } 
  

This is a fragile test.
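
The mechanism can be reproduced in isolation. The sketch below uses minimal stand-ins for the article’s types: the getValue() method, the class bodies, and the WrapperDemo helper are assumptions made for illustration, since the real interface is not shown here.

```java
// Minimal stand-ins; getValue() and the class bodies are assumed, not taken from the repository.
interface JCharacter {
  char getValue();
}

class JDigit implements JCharacter {
  private final char value;

  JDigit(char value) {
    this.value = value;
  }

  @Override
  public char getValue() {
    return value;
  }
}

// Delegates everything to the wrapped object, so the observable behavior is unchanged.
class JCharacterWrapper implements JCharacter {
  private final JCharacter delegate;

  JCharacterWrapper(JCharacter delegate) {
    this.delegate = delegate;
  }

  @Override
  public char getValue() {
    return delegate.getValue();
  }
}

class WrapperDemo {

  // Structural check, in the spirit of testWhiteBox: depends on the concrete class.
  static boolean structuralCheck(JCharacter character) {
    return character instanceof JDigit;
  }

  // Behavioral check: depends only on what the object returns.
  static boolean behavioralCheck(JCharacter character) {
    char value = character.getValue();
    return value >= '0' && value <= '9';
  }
}
```

For a plain JDigit both checks return true. For a JCharacterWrapper around the same JDigit, behavioralCheck still returns true, but structuralCheck returns false because the wrapper is not a JDigit, even though it behaves exactly like one.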

Conclusion

Try hard not to depend on white-box tests; they will probably be fragile.