You suck at TDD #4 – External dependencies
When I started doing TDD, I thought it was pretty clear what to do with external dependencies. If your code writes to a file system – for example – you just write a file system layer (what would typically be called a façade, though I didn’t know the name of the pattern back then), and then you can mock at that layer, and write your tests.
This is a very common approach, and it works well enough in many scenarios, and because of that I see a lot of groups stick at that level. But it has a significant problem: it is missing an important abstraction. That missing abstraction usually shows up in two very specific ways:
- The leakage of complexity from the dependency into the application code
- The leakage of implementation-specific details into the application code
Teams usually don’t notice the downside of these, unless a very specific thing happens: they get asked to change the underlying technology. Their system was storing documents in the file system, and it now needs to store them in the cloud. They look at their code, and they realize that the whole structure of their application is coupled to the specific implementation. The team hems and haws, and then comes up with a 3 month estimate to do the conversion. This generally isn’t a big problem for the team because it is accepted pretty widely that changing an underlying technology is going to be a big deal and expensive. You will even find people who say that you can’t avoid it – that it is always expensive to make such a change.
If the team never ends up with this requirement, they typically won’t see the coupling, nor will they see the downside of the leakage. In my earlier posts I talked about not being sensitive to certain problems, and this is a great example of that. Their lives will be much harder, but they won’t really notice.
Enter the hexagon
A long time ago in internet time, Alistair Cockburn came up with a different approach that avoids these problems, which he called the Hexagonal Architecture. The basic idea is that you segment your application into two different kinds of code – there is the application code, and then there is the code that deals with the external dependencies.
About this time, some of you are thinking, “this is obvious – everybody knows that you write a database layer when you need to talk to a database”. I’ll ask you to bear with me for a bit and keep in mind the part where if you are not sensitive to a specific problem, you don’t even know the problem exists.
What is different about this approach – Cockburn’s big insight – is that the interface between the application and the dependency (what he calls a “port”) should be defined by the application, using application-level abstractions. This is sometimes expressed as “write the interface that you wish you had”. If you think of this in the abstract, the ideal would be to write all of the application code using the abstraction, and then go off and implement the concrete implementation that actually talks to the dependency.
What does this give us? Well, it gives us a couple of things. First of all, it typically gives us a significant simplification of the interface between the application and the dependency; if you are storing documents, you typically end up with operations like “store document, load document, and get the list of documents”, and they have very simple parameter lists. That is quite a bit simpler than a file system, and an order of magnitude simpler than most databases. This makes writing the application-level code simpler, with all of the benefits that come with simpler code.
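To make that concrete, here is a minimal sketch of what such a port might look like in C#; the Document type and the exact method names are illustrative assumptions, not code from any real application.

```csharp
using System.Collections.Generic;

// Illustrative document type; a whole value carrying what the application cares about.
public record Document(string Name, string Contents);

// The port: document storage described in application terms, with no
// file-system or database details leaking through.
public interface IDocumentStore
{
    void SaveDocument(Document document);
    Document LoadDocument(string name);
    IEnumerable<string> GetDocumentNames();
}
```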
Second, it decouples the application code from the implementation; because we defined the interface at the application level, if we did it right there are no implementation-specific details at the app layer (okay, there is probably a factory somewhere with some details – root directory, connection string, that sort of thing). That gives us the things we like from a componentization perspective, and incidentally makes it straightforward to write a different implementation of the interface in some other technology.
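That factory can be nothing more than the one place that knows those details. A sketch, assuming the port above and a file-system adapter that takes a root directory (both the factory and the parameter are made up for the example):

```csharp
// Hypothetical composition root: the only code that knows which adapter is in
// use and what implementation-specific configuration it needs.
public static class DocumentStoreFactory
{
    public static IDocumentStore CreateFileSystemStore(string rootDirectory) =>
        new DocumentStoreFileSystem(rootDirectory);
}
```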
At this point there is somebody holding up their hand and saying, “but how are you going to test the implementation of the port to make sure it works?” BTW, Cockburn calls the implementation of a port an “adapter” because it adapts the application view to the underlying dependency view, and the overall pattern is therefore known as “port/adapter”.
This is a real concern. Cockburn came up with the pattern before TDD was really big, so we didn’t think about testing in the same way, and he was happy with the tradeoff of a well-defined adapter that didn’t change very often and therefore didn’t need a lot of ongoing testing, because the benefits of putting the “yucky dependency code” (my term, not his) in a separate place were so significant. But it is fair to point to that adapter code and say, “how do you know that the adapter code works?”
In the TDD world, we would like to do better. My first attempt did what I thought was the logical thing to do. I had an adapter that sat on top of the file system, so I put a façade on the file system, wrote a bunch of adapter tests with a mocked-out file system, and verified that the adapter behaved as I expected. That worked because the file system was practical to mock, but it would not have worked with a database because of the problems with mocking one.
Then I read something that Arlo wrote about simulators, and it all made sense.
After I have created a port abstraction, I need some way of testing code that uses a specific port, which means some sort of test double. Instead of using a mocking library – which you already know that I don’t like – I can write a special kind of test double known as a simulator. A simulator is simply an in-memory implementation of the port, and it’s generally fairly quick to create because it doesn’t do a ton of things. Since I’m using TDD to write it, I will end up with both the simulator and a set of tests that verify that the simulator behaves properly. But these tests aren’t really simulator tests, they are port contract tests.
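A sketch of such a simulator, assuming the IDocumentStore port sketched above:

```csharp
using System.Collections.Generic;

// Sketch of an in-memory simulator: a real implementation of the port that
// stores documents in a dictionary instead of on disk or in a database.
public class DocumentStoreSimulator : IDocumentStore
{
    private readonly Dictionary<string, Document> _documents = new();

    public void SaveDocument(Document document) => _documents[document.Name] = document;

    public Document LoadDocument(string name) => _documents[name];

    public IEnumerable<string> GetDocumentNames() => _documents.Keys;
}
```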
So, I can point those contract tests at other implementations of the port (i.e. the ones that use the real file system or the real database) and verify that those adapters behave exactly the way the simulator does. That removes the need to test the other adapters with traditional mock-based unit tests; all I care about is that every adapter behaves the same way. And it actually gives me a stronger sense of correctness, because when I used the façade I had no assurance that the file system façade behaved the same way the real file system did.
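One common way to express that – a sketch using NUnit, assuming the earlier IDocumentStore and DocumentStoreSimulator sketches and a DocumentStoreFileSystem adapter that takes a root directory – is an abstract contract-test class with one subclass per adapter:

```csharp
using System.IO;
using NUnit.Framework;

// Port contract tests: the same tests run against every adapter, so the
// simulator and the real adapters are verified to behave the same way.
public abstract class DocumentStoreContract
{
    // Each adapter-specific subclass supplies its own implementation of the port.
    protected abstract IDocumentStore CreateStore();

    [Test]
    public void SavedDocumentCanBeLoaded()
    {
        IDocumentStore store = CreateStore();
        store.SaveDocument(new Document("report", "contents"));

        Assert.That(store.LoadDocument("report").Contents, Is.EqualTo("contents"));
    }
}

public class DocumentStoreSimulatorContractTests : DocumentStoreContract
{
    protected override IDocumentStore CreateStore() => new DocumentStoreSimulator();
}

public class DocumentStoreFileSystemContractTests : DocumentStoreContract
{
    // Assumed constructor; cleanup of the temporary directory omitted for brevity.
    protected override IDocumentStore CreateStore()
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(root);
        return new DocumentStoreFileSystem(root);
    }
}
```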
In other words, the combination of the simulator + tests has given me an easy & quick way to write application tests, and it has given me a way to test the yucky adapter code. And it’s all unicorns and rainbows from then on. Because the simulator is a real adapter, it supports other uses; you can build a headless test version of the application that doesn’t need the real dependency to work. Or you can make some small changes to the simulator and use it as an in-memory cache that sits on top of the real adapter.
Using Port/Adapter/Simulator
If you want to use this pattern – and I highly recommend it – I have a few thoughts on how to make it work well.
The most common problem people run into is in the port definition; they end up with a port that is more complex than it needs to be or they expose implementation-specific details through the port.
The simplest way to get around this is to write from the inside out. Write the application code and the simulator & tests first, and only go and write the other adapters when that is done. This makes it much easier to define an implementation-free port, and that will make your life far easier.
If you are refactoring into P/A/S, then the best approach gets a little more complex. You probably have application code that has implementation-specific details. I recommend that you approach it in small chunks, with a flow like this (a sketch of where the passes end up follows the list):
1. Create an empty IDocumentStore port, an empty DocumentStoreSimulator class, and an empty DocumentStoreFileSystem class.
2. Find an abstraction that would be useful to the application – something like “load a document”.
3. Refactor the application code so that there is a static method that knows how to drive the current dependency to load a document.
4. Move the static method into the file system adapter.
5. Refactor it to an instance method.
6. Add the method to IDocumentStore.
7. Refactor the method so that the implementation-dependent details are hidden in the adapter.
8. Write a simulator test for the method.
9. Implement the method in the simulator.
10. Repeat steps 2-9.
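As an illustration of where those passes end up – assuming the IDocumentStore sketch from earlier; the DocumentViewer class and the one-file-per-document convention are made up for the example:

```csharp
using System.Collections.Generic;
using System.IO;

// The file-system adapter after the refactoring: all of the path and file
// handling lives here, behind the port.
public class DocumentStoreFileSystem : IDocumentStore
{
    private readonly string _rootDirectory;

    public DocumentStoreFileSystem(string rootDirectory) => _rootDirectory = rootDirectory;

    public Document LoadDocument(string name) =>
        new Document(name, File.ReadAllText(PathFor(name)));

    public void SaveDocument(Document document) =>
        File.WriteAllText(PathFor(document.Name), document.Contents);

    public IEnumerable<string> GetDocumentNames()
    {
        foreach (string path in Directory.EnumerateFiles(_rootDirectory, "*.txt"))
            yield return Path.GetFileNameWithoutExtension(path);
    }

    private string PathFor(string name) => Path.Combine(_rootDirectory, name + ".txt");
}

// The application code ends up depending only on the port, so the simulator can
// stand in during tests and a different adapter can be swapped in later.
public class DocumentViewer
{
    private readonly IDocumentStore _store;

    public DocumentViewer(IDocumentStore store) => _store = store;

    public string Show(string name) => _store.LoadDocument(name).Contents;
}
```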
Practice
I wrote a few blog posts that talk about port/adapter/simulator and a practice kata. I highly recommend doing the kata to practice the pattern before you try it with live code; it is far easier to wrap your head around it in a constrained situation than in your actual product code.
This idea is actually much older than Alistair. The SOLID principles describe exactly that – especially the Dependency Inversion Principle, which explains that the “client owns the abstraction”. So Alistair’s ports and adapters are nothing more than a correct application of the SOLID principles.
Steven,
There is obviously a big correlation between P&A and SOLID, but I don’t think it’s quite as simple as you suggested. P&A is dependency inversion, but it takes a stronger position on where the line is drawn between the application code and the dependency. You can write a file system abstraction that is SOLID – I’ve written those myself – and I think drawing the line at the right point is really the important part of P/A. This is especially true when you are trying to improve the design of existing code, as putting the abstraction close to the dependency is generally the cheaper choice. There is also the significant difference between a principle and a pattern; I think that “consider P/A/S when you are dealing with dependencies” is more likely to lead to better code than “remember SOLID when you are dealing with dependencies”.
While I like dependency inversion the way that P&A does it, I am not a big fan of SOLID; I've found that codebases that follow it closely end up hard to understand. I've found that the world gets simpler if you focus on whole values and no mocks (aka "as few mocks as possible"). Brian Geihsler does a good job talking about it in this post (make sure to read the update and all of the links in it, including Arlo Belshee's post about no mocks) – qualityisspeed.blogspot.com/…/why-i-dont-teach-solid.html. My experience is that the approach that Brian and Arlo talk about leads to code that is simpler to read and simpler to test. My experience is also that very few people see primitive obsession and what it costs, and some will even advocate against whole values because the resulting classes are "too small".