A design discussion, some options, and unidirectional models

When writing about discoverability in my last two posts, I came across the code below, which violated that principle.  It was doing some metaprogramming in an attempt to avoid repetition and be more DRY.

def base_value_for(key)
    case info.send("#{key}_base")
    when 'value1'
        value1
    when 'value2'
        value2
    else
        raise ArgumentError, "Base keys must be value1 or value2"
    end
end

The caller can then look up any of the three base values by passing a key into the base_value_for method.  The methods one_base, two_base and three_base are known and exist on the info object that receives the send.  The UI has ensured they are set to either 'value1' or 'value2'.  The current object has a method returning the value for value1 and another returning the value for value2.  So this code essentially allows the initial base value for a calculation to be configured.

The caller would then call one of base_value_for('one') OR base_value_for('two') OR base_value_for('three').  Calling with any other key that didn't resolve to a <key>_base method on the info object would blow up.

What is wrong with this?  Take a read here.  TL;DR – if you're in the info class and search the code base for 'one_base', no search hit will tell you that it is being called from this code.  So you may assume it isn't being called and refactor it away.  The principle of discoverability is violated.

This code was written to avoid writing:

def base_for_one_base
    case info.one_base
    when 'value1'
        value1
    when 'value2'
        value2
    else
        raise ArgumentError, "Base keys must be value1 or value2"
    end
end

and repeat for two and three.

An alternative to the above non-DRY implementation is

def base_for_one_base
    base_for(info.one_base)
end

private
def base_for(info_base)
    case info_base
    when 'value1'
        value1
    when 'value2'
        value2
    else
        raise ArgumentError, "Base keys must be value1 or value2"
    end
end

Now the difference between base_for_one_base and the versions for two and three is just a single line.

def base_for_two_base
    base_for(info.two_base)
end

def base_for_three_base
    base_for(info.three_base)
end

This is probably DRY enough.  We could stop here.

An alternate design

The info model exists to hold the knowledge about the configuration.  So an alternative design could be to push the decision logic into the info model.

The problem with doing this is that the method is also using the value1 and value2 methods on the parent model.

One solution is to Just Use It.

# info class
belongs_to :parent

def base_for_one_base
    base_for(one_base)
end

private

def base_for(info_base)
    case info_base
    when 'value1'
        parent.value1
    when 'value2'
        parent.value2
    else
        raise ArgumentError, "Base keys must be value1 or value2"
    end
end

This sets up a cyclic dependency.  The parent model has an info model.  The info belongs to the parent model.  The info can be loaded and accessed directly by its parent.  The parent can be loaded and accessed directly by the info model.  But what if code were written that accidentally called back and forth recursively between the two models?

On this small scale, it might be okay.  On a larger scale it might be much harder to reason about across several models.  And multiple models may be doing this in a highly interconnected way – and you then may have one large ball of interconnected models – or mud.

Unidirectional models as a design constraint

What would the code look like if we didn't accept cyclic dependencies?  If all interactions with a model were always from parent to child?  That could make the code simpler in the long run, as only half the number of linkages would exist: one link in a single direction instead of one for each direction.

The code in this example would then look like:

# parent class
def base_for_one_base
    info.base_for_one_base(value1, value2)
end

# info class
def base_for_one_base(value1, value2)
    base_for(one_base, value1, value2)
end

private

def base_for(info_base, value1, value2)
    case info_base
    when 'value1'
        value1
    when 'value2'
        value2
    else
        raise ArgumentError, "Base keys must be value1 or value2"
    end
end

The negative here is that we need to pass in the data from the parent that the child needs.  Perhaps that data shouldn't live on the parent, but rather move to the child.  That might be a better design – the code is telling us so – but it was something we didn't want to move yet.  That knowledge is not exposed to the consumer, though.  The consumer only ever gets the base value via the parent.
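
As a rough sketch, purely hypothetically, if value1 and value2 did move onto the info model, both the parameter passing and the reference back to the parent would disappear:

# info class – hypothetical: value1 and value2 now live here
def base_for_one_base
    case one_base
    when 'value1' then value1
    when 'value2' then value2
    else raise ArgumentError, "Base keys must be value1 or value2"
    end
end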

Simplicity

I favour one-way connections over bi-directional ones as they are simpler to reason about.  They result in cleaner, contained messes that are easier to understand (and potentially replace).

This contrasts with the effort required to resist the pull towards bi-directional connections that Rails encourages and makes so easy.  SQL data integrity also pulls towards bidirectional connections.

Keeping connections one-way might mean jumping through some hoops.  But then the design is speaking, and the constraint is possibly telling me that the design of the data needs to change.

I favour smaller, contained messes to make reasoning and understanding of the code easier.  One way to achieve that is through unidirectional models.


Coding with confidence

Change is inevitable. It is the one constant in software development.

I find optimising to be confident in the face of this inevitable change valuable. I want to be confident that what I’ve done – what I’m releasing – really works.

When changing software I’ve come to ask several questions that help me to be confident in the changes that I make.

  • How will I know when something breaks? Does it matter if some given functionality breaks?
  • How easy is it to understand?
  • How localised would changing this be in the future in any new direction?
  • How confident am I? How can I be more confident?

This leads me to a list of principles that I currently value when writing software.

  • Feedback
  • Simplicity
  • Changeability
  • Ease of Understanding
  • Deliberate Intent

These principles lead me towards certain practices that I currently find useful.

Feedback
In order to know when something breaks a feedback mechanism is needed. Waiting for feedback from an end user that you messed up is far, far too late. The best time for that feedback is the instant the change was made. That is the time when the knowledge is known about what the breaking change is and hence how to fix it. This means testing. Testing at the right levels. Robust testing to ensure that the feedback is as clear, direct and immediate as possible.

Emergent design and Test Driven Development combine to provide feedback. The tests confirm that the implemented requirements are still working.

In order to build something that I'm unsure of, I want to do the smallest thing possible to validate whether what I'm trying to do will work. I cannot write tests when I'm unsure of how it will be called or of how it works. I don't want to build something only to discover that what I thought I was being called with isn't actually what I'm being called with.

Unexpected things can happen. If I don’t expect them, I may not plan for them. But often I will implement ways to log or notify the unexpected – for when the unexpected happens. For instance using logs to log unexpected errors, or tools like Airbrake (for error catching and logging) or New Relic (for real time stats).
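
As a small illustration – assuming a Rails app with the Airbrake gem, and do_the_risky_thing as a placeholder – the shape is usually no more than:

begin
    do_the_risky_thing
rescue StandardError => e
    Rails.logger.error("Unexpected failure: #{e.message}")  # leave a trail in the logs
    Airbrake.notify(e)                                      # surface it in the error tracker
    raise
end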

Simplicity
Systems become complex. Large balls of mud become very complex. If one assumes entropy in software is inevitable, any system will get messy. Instead of one large mess I try to focus on creating lots of small messes. Patterns I use to help towards small, contained messes include the Aggregate pattern; valuing Composition over Inheritance; separation of IO and CPU operations; Pipes and Filters; and Anti-Corruption layers.

Emergent design helps me keep a solution simple by only implementing the required functionality and allowing the design to emerge effectively behind the specific interface.

Changeability
I try to isolate the same domain concepts using DRY. But I try not to isolate accidentally similar concepts. As Sandi Metz puts it – prefer duplication over the wrong abstraction.

Test! Tests provide me feedback on the change that is being made. The expected tests should break. Tests that break unexpectedly should be looked at to understand the unexpected coupling in the system. I try to focus on increasing cohesion over reducing coupling. Increased cohesion provides a better-defined system – while also reducing coupling.

I aim to isolate change with design practices and test practices. The goal is to test at the right level. I try to test the interface so that everything below the interface can be allowed to emerge in whatever sensible design is needed for the requirements now. I encapsulate object creation in tests to allow the creation of the object to change over time without having to change a large number of tests. The goal is that any change in any direction is as painless as possible.  This doesn’t mean predicting the change, but rather isolating the obvious changes and hence limiting their impact when they do change.
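
A small sketch of what I mean by encapsulating object creation in tests – the Exporter class, its constructor arguments and the RSpec example are all illustrative:

# spec/exporter_spec.rb
RSpec.describe Exporter do
    # Single place that knows how to build the object under test; if the
    # constructor changes, only this helper changes, not every example.
    def build_exporter(overrides = {})
        Exporter.new({ format: :csv, separator: ',' }.merge(overrides))
    end

    it 'uses the configured separator' do
        exporter = build_exporter(separator: ';')
        expect(exporter.separator).to eq(';')
    end
end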

There should be no area that is too scary or painful to change. When there is the goal becomes to find ways to reduce the fear or pain – assuming it still needs to change.

Ease of understanding
The ability of someone else (or yourself in 6 months time) to understand the existing code is highly underrated. Code is write once, read many. Every time an issue occurs near a given part of code, the code may need to be understood. If it is clear and obvious and is able to be trusted, it will be quick. If not, it could add 30 minutes or more to understand the code even if the code is not relevant to the problem. The cost of not being able to read and understand the code quickly is high over the lifespan of the system when considering the number of people who will read the code over time in their quest to make changes and solve problems.

I use Behaviour Driven Development to drive what tests to write. If I can't write the test, or the code cannot be built from the test, then I do not understand what I'm doing yet. The tests describe the intent of the code.

Discoverability – how will someone else discover that the code works like this? I try to ensure that breadcrumbs exist for a future developer to follow. If the knowledge is discoverable, they can at least ask the question – do I need this?

I value explicit over implicit. If developers are joining your team, how will they know how the system works? Code is obvious. Implicit rules are not. Frameworks such as Rails come with a lot of implicit rules. These rules can be very opaque no matter how experienced the new developer is. You don’t know what you don’t know. The power of Rails is the weakness of Rails. But good Rails developers can move between Rails projects and know the implicit rules. But what if there are implicit rules that are specific to the code base? Those are hard to know. Again – you don’t know what you don’t know… until you realise that you didn’t know it and someone tells you – or you go really deep. Either way a lot of time can be unnecessarily lost grappling with the unknown.

Deliberate intent
I try to make conscious, well-reasoned design decisions. My goal is to keep the exposed interface tight and deliberately defined. For any solution I try to think of at least three ways to solve the problem and then make a choice based on the pros and cons. This ensures that I’m thinking deeply about the problem instead of simply implementing the first design thought that arose in my mind.

A continued experiment
These are my principles and practices that I’m experimenting with. Over time I imagine they may change or modify as I continue to experiment and learn.

In order to experiment I generally try applying a constraint and see what it does to the code base. Do we like it? Does it make things better? How do we feel about the result? To paraphrase Kent Beck – If it doesn’t help, stop doing it. If it does, do more!

I continue to look for the experiments that I can do to make things better along with the underlying principles and values that they represent. I hope to blog about some of those experiments in the future.

There is a lot more that could be said about each of these principles. I hope to follow up with related posts digging into different implementations and designs that I have derived from these principles.

 

Credit
Credit goes to Kent Beck's Extreme Programming Explained, which focuses on Values, Principles and Practices. I find this an incredibly useful way in which to view software development.

Libraries

Libraries are useful. Using a library in order to accelerate your development is a great thing. It is fantastic if someone has implemented a generic solution that means that my team does not need to learn and implement the significant details of that solution. The business goal is usually not for a team to learn the deep details of a solution, but rather to achieve the goal. Using a solution – for example an Excel library to export some data to the Excel format – instead of writing it ourselves – meets the business goal faster (hopefully).

But is it always a better idea to use a library than to write it yourself? And how do you choose between one library and another?

Complex vs. Trivial

It feels reasonable to include a library to do something complex that would take a significant amount of time and effort to achieve.

It feels unreasonable to include a library to do something trivial that can quickly and simply be written yourself.

Adding a library introduces a new external dependency to the code base that is currently not there. It will probably remain a dependency for much of the rest of the lifetime of this feature – which could be a significant length of time. Dependencies come with their own dependencies, and those dependencies with theirs, and so on. External dependencies can cause dependency conflicts. They can cause pain when upgrading. They can stop being supported, or stop working, in the future. For example, some Gems (Ruby library dependencies) were no longer supported when moving from Rails 2 to Rails 3 – which causes significant pain in those migrations as the dependencies need replacing, either with code written by the team or with a new dependency. Either way, lots of testing will occur.

Use a high percentage of the library

It feels unreasonable to include a large library when you only use a very small percentage of it.

A library that has many more reasons to change than the one you are using it for will be updated and upgraded for reasons unrelated to why it is in your code. That can mean future upgrade churn with no value. The library may even be abandoned for reasons unrelated to how you use it.

A software decision is made at a moment in time

Using a library is a software decision made at a moment in time. At that moment the choice made might have been the correct solution to the problem. However at any moment in time in the future it may no longer be the correct solution. That is the reality of supporting changing business requirements and needs. A question that should always be asked is:

What if this decision is the wrong one? How would we refactor out of this decision?

We all know that in software, we learn more as we go. The key question with any software that we develop should be how do we refactor out of this decision if we learn more and the decision is no longer the correct one for the business needs now.

Can you get out of the decision to use a given library? How would the code look to protect against the need to stop using the library and easily replace it with another library or your own code? What would it look like if you tried to do that for a framework choice?

Isolate the External Dependency – Know Thy Seam

Michael Feathers talks about seams in his book Working Effectively with Legacy Code (http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052). Seams are a generically useful pattern. A seam is a place that you can test around, which also means you can replace what is behind the seam with any other code that satisfies those tests and everything should continue to work.

Single use dependency

If this is the first (and, as far as we know right now, only) time that we will use this dependency, then wrap the usage of it, covered by tests, inside the class that uses it.

For instance, if you're using an Excel exporter library and this is the only place that will ever use it, ensure that it is only used / included in this component / class and not auto-loaded throughout the entire code base. Then ensure that the class that uses it is covered by tests for the scenarios that use the library. This class is then the seam that allows you to keep the library under control.

To continue with the Excel example – make sure you load Excel files in your tests, and hence test your code's usage of the library and its expectations of how Excel files will be represented to your code. If you do not, and the library changes, you will not know if the library has broken your code.
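
For example, a test along these lines – the spreadsheet gem, ContactExporter and Contact are all hypothetical names – exercises our usage of the library through a real file:

RSpec.describe ContactExporter do
    it 'round-trips contacts through a real spreadsheet file' do
        contacts = [Contact.new(name: 'Ada'), Contact.new(name: 'Grace')]
        path = ContactExporter.new(contacts).write_to_tmp_file

        # Read the generated file back through the library, so this test breaks
        # if the library changes how it represents spreadsheets to our code.
        sheet = SomeSpreadsheetGem.open(path).sheet(0)
        expect(sheet.row(0)).to eq(['Ada'])
    end
end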

Multi use dependency

If this is a generic piece of code that will be used in many places, wrap it in your own wrapper and ensure that the library is only used / included in the wrapper and not auto-loaded throughout the entire code base. Test the wrapper and its interactions.  The wrapper is now the seam that allows you to keep the library under control and easily replaceable.

For the Excel library example, make sure representative Excel files that you expect to interact with are covered with tests.
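
A sketch of what such a wrapper might look like – the gem name and its API are placeholders; the point is that only this one file knows about them:

# app/lib/spreadsheet_writer.rb – the only file that requires the gem
require 'some_spreadsheet_gem'  # hypothetical library

class SpreadsheetWriter
    def initialize(rows)
        @rows = rows
    end

    # The rest of the code base talks to this method, never to the gem directly.
    def write(path)
        book = SomeSpreadsheetGem::Workbook.new
        @rows.each { |row| book.append_row(row) }
        book.save(path)
    end
end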

Don’t be naïve about the altruistic nature of library writers

The thing with frameworks and libraries is that they are written by people who are (most likely) experiencing different forces to what you are experiencing in your code base. Their decisions are dictated by the forces they are experiencing, not the forces the business you are writing code for is experiencing.

Things change over time. Decisions that are made now may change in the future. Your needs of a library may change. The direction and intent of the library may change. You may in fact be using some magic of the library in an unanticipated way, which may break for you in the future due to a library code change. Things change, things migrate and upgrade. It isn't the intent of a library creator to mess you up, but you are making decisions about using their code while not being in control of their code. This may cause you (or more to the point the business whose code it is) a bunch of pain in the future. So ensure that you remain in control when their change causes you pain.

Recommended Do Not Do Suggestions…

Do not smear a library around multiple classes.  Later you may discover you can no longer use it. In Rails, many Gems provide generic ‘helpful’ solutions, like attachments that hook on to your models. But what if on migrating to a new Rails version the Gem is no longer available? If you are not in control, then a lot of rework is required to dig yourself out of the hole. I have seen this in a Rails 2 to Rails 3 migration, which resulted in significant problems with Gems that were no longer available.

Do not hook yourself so tightly to the library of choice, in a lot of places, that it is hard to replace. .NET comes with a reporting library which, from Visual Studio 2005 to Visual Studio 2008, was completely rewritten by Microsoft, and you could no longer load the old reporting components in VS2008. With a lot of reports, a limited abstraction layer and high coupling to the .NET reporting components, a lot of code needed rewriting to migrate to the newer version. Be aware that even when you think the choice is obvious and will never change, the library writer may force you to change and prove your decision not to protect yourself to be incorrect.

The economics of the issue

Fast now always looks better than being marginally slower now in exchange for potentially being faster in two years' time. This may be a valid trade off for a company that has no money and is desperately attempting to survive. But once you've moved from survival mode to wanting the agility to change as needed, the trade off becomes easy. And some may argue you might want agility in survival mode just as much – or even more.

Stay in control

Keep control of your code and your dependencies. Make it easy to replace things. Any time a decision that was made is no longer a good one, refactor to replace.

And always question how you will refactor away from any given external dependency. Knowing that will ensure that your code will remain agile.

Frameworks

Frameworks are awesome. Web frameworks such as Rails, .NET MVC, Django, Node, Flask, Sinatra are powerful accelerators of software development. They provide scaffolding that you can use, but you don’t necessarily have to write or maintain.

Frameworks are alluring. They often focus on providing features that are geared to accelerate development and are eminently demoable. Developers are encouraged to use all features everywhere. Those demos never seem to discuss the problems that could arise by using the feature as the system grows over the years. The demos do not discuss how to protect the system using them against how the feature may change.

Experience

My experience over the years, working on large and small systems, is that extensive coupling to a framework across the whole code base almost always turns out to be a bad idea. For a small enough system it can work. But systems rarely seem to stay small without a lot of premeditated effort.

I’ve seen front-end JavaScript messes where the only way out is a complete rewrite, as the framework wants to own the world and the technology choices are no longer correct for the business requirements. I’ve seen back end .NET, Java and Ruby implementations where you’re encouraged to build systems tightly coupled to the framework.

Systems make it easier or harder to do the right thing. A system that makes it easy to do the right thing allows its developers to fall into the pit of success, no matter the strength of the developer. Frameworks far too often encourage designs that are tightly coupled to the framework. As a result, better developers are needed in order to ensure a good, sustainable design.

An alternative design

An alternative design is to keep the framework at the edges of the system. This is similar to (but possibly not as hardcore as) the hexagonal architecture / clean architecture / screaming architecture. [https://blog.8thlight.com/uncle-bob/2012/08/13/the-clean-architecture.html]

Controllers, in the Rails and .NET MVC sense, are about converting the web into API calls to your code. Views are about displaying the information that they need. Database objects – active record or inside a repository – are for accessing the database. The business logic that the controllers call, that gets presented to the view, that transforms inputs into models that could be persisted – that doesn’t need a framework. Any code written in your language that does not rely on your framework is not going to break when the framework is upgraded – or replaced.

When using a database, keep all the data access behind an API that returns single, or lists of, objects that you can use. Pass those objects to be manipulated with your system’s logic in non-framework code. Save the results back to the DB by passing objects (or hashes) to the database API.

Implementing the database code behind an interface means you can change it at will – even swapping out a database implementation if you must. It is a single seam for data access of the application. Keeping database access at the edges of your code allows optimising database calls in one place instead of having multiple calls spread throughout the business logic code. This may result in loading more data at one time initially, but deal with a memory problem when you have one, rather than smearing your logic and your framework code together across your code base.

For example:

def controller_export_method
    # Get the parameters from the web
    firstname = params[:firstname]
    lastname = params[:lastname]
    # Fetch objects from the DB
    contacts = Contacts.fetch_for_name(firstname, lastname)
    # Do something with the objects
    results = ContactExportFormat.transform(contacts)
    # Persist to the DB / export and assign the result for the view to display
    @result = ContactExportLog.persist(results)

    # send_file / send_data the export
end
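
And the Contacts.fetch_for_name call above would sit behind its own small data-access API. A sketch, assuming an ActiveRecord Contact model:

# The single seam for contact data access; returns plain objects / records.
class Contacts
    def self.fetch_for_name(firstname, lastname)
        # One place to tune the query later (eager loading, limits, caching).
        Contact.where(firstname: firstname, lastname: lastname).to_a
    end
end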

Selectively break the rules, when you must

When you need to optimize some code due to speed or memory constraints, then potentially mix logic and database and other external accesses, but as a last resort. Lean towards enumerable objects that do lazy evaluation and yield to the business logic so that these dependencies are clear and separate. But start with clean, clear separations and muddle them up only for a clear and obvious external business goal such as performance.
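
A sketch of that shape in Ruby – Contact and format_line are placeholders, and ActiveRecord's find_each is assumed – keeping IO at the edges and the logic pure in the middle:

def export_large_report(io)
    contacts = Contact.find_each.lazy               # DB access, streamed lazily
    lines = contacts.map { |c| format_line(c) }     # pure business logic, no IO
    lines.each { |line| io.puts(line) }             # IO again, only at the edge
end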

Focused Usage

Focused functionality provided by a framework is great. For example, routing to a controller is a common pattern implemented by many web frameworks. It is clear, contained and testable. The feature does a specific thing, and it can be tested and proven to work.

Unfocused, extensive framework usage is a problem. The understanding of the reason for its usage can be lost. Once it is lost, it becomes incredibly difficult (if not impossible) to get it safely under control again as no one is really sure what it is doing or why it is there.

Focused usage allows change to be isolated. If the framework makes a change to how params are dealt with, then if we know we only deal with params in controllers, then we know where to look to fix it. If we’ve allowed params to be used randomly throughout the system we’re in trouble. Isolate change. And make sure the behaviours that are expected are codified in tests so we can know if the framework changes that things still work.

The Framework Moves

Frameworks move at a steady pace. Bugs are fixed, security patches are applied, features are added and deprecated. All of this happens at a reasonable rate. For most code bases that live for more than a couple of years, the framework will be able to be upgraded multiple times. The less code that is written that is coupled to the framework, the less painful that upgrade will be.

And so

If all your code inherits from a framework class, you are doing something wrong.

Limit frameworks to do well understood, localised and controllable things. These could then be replaced with some other version of the framework easily.

Frameworks that force you to design in a specific way and are smeared around your entire code base will probably cause a world of pain in the long term.  (Or let's be honest, probably not you, but the business that owns the code and the poor guy who now owns it.)

It can be faster to just do it the framework way initially. In the short term. But it might not be responsible in the long term. The economics of short term gain over potential later cost are hard. But if you expect the code to be around for more than 2 years, they should stop being hard.

Frameworks are awesome.

Held at arm's length. Under your control.

But keep in mind, non-framework code will always survive a framework change.

Component Configuration – a design conversation

A recent conversation led me to want to write down a design train of thought. Perhaps some might find it interesting. And others could point out flaws or better principles that I missed along the way.

Background
A requirement existed for a component to expose a client interface to a web API that is also being written. This client component could then be reused by different applications consuming the web API.

As the web API and client component are both developed by the same developers, it would be desirable to stand up different instances of the web API – such as staging and production – and to have different instances of the consuming application talk to those different web API instances, to test against staging or consume the production installation.

The design question

Where do we define the URL that the component should talk to?

Do we encapsulate the knowledge about the URL(s) in the component?
Or do we pass the URL to call to the component?

What if the component knows about the URL directly?
The application then just calls the component, and the component just talks to the right URL and magic happens. That is pretty awesome! I love the fact that my consuming applications don't need to care. But…

But I now have unexpected coupling to the application’s deployment strategy
What does that mean for deployment? When I deploy the application to development, staging, UAT or production there might be different URLs that the component needs to call. Those URLs might be different depending on the environments defined for the application – or the application consuming the component. Hmmm… that means that my deployment needs to know about my component to set up the URL for the component. Or my component needs to know about my deployment environments in order to select the right URL. That really doesn't feel great. Knowledge of my different deployment environments is bleeding into my simple component. Yes, I could possibly minimise it by using a convention, but it is feeling a lot less clean now. I'm not a fan of this anymore.

What if my application tells the component what the URL is?
My application deployment already has to deal with setting different values depending on the environment. I think I'd rather extend those existing, working mechanisms than start Yet Another Way of handling different environments and couple my component to the deployment mechanisms of each application that consumes it. And I don't want to couple every component to all the consuming applications' deployment scenarios. That wouldn't feel great.

Hmmm… so I feel like the application managing the URL endpoint that the component calls is cleaner for now.

But I did like the ease of use of the original suggestion. I wonder if I could still achieve that.

Another design question

How does the component get to know the URL?

set_url
One option is something like:

 Component.set_url('www.url.org')
 the_things = Component.fetch_all_the_things

This looks nice and simple. But, what is to stop the consumer from not calling set_url? What do we do then? And also – could we get back to the original simple scenario of just doing

 the_things = Component.fetch_all_the_things

Ah – but we could make it a static URL and call set_url on application start up. That way we don’t have to worry about not calling it as it will have been called at initialisation.
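
In a Rails application that might look something like this – the initializer path and the environment variable name are illustrative assumptions, not part of the original design:

 # config/initializers/component.rb
 Component.set_url(ENV.fetch('COMPONENT_URL'))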

But what happens if we do get into our code with the URL not set? What should happen then?

Temporal Coupling
We have some implicit temporal coupling between the setting of the URL and the calling of the component. I’ve made it even worse by “fixing” it by moving the set_url call into the application start up in order to achieve my simple component call.

I’m not sure I like the temporal coupling. I could live with it. But surely there is a better way? One in which I am less likely to get into trouble with down the road?

Pass the configuration information always

Another idea to solve the temporal coupling is to do something like

 the_things = Component.fetch_all_the_things('www.url.org')

Or

 client = Component.new('www.url.org')
 the_things = client.fetch_all_the_things

Both of these solve the problem. But now I need to pass the URL every time I want to do my simple component call that I've been chasing. But I have removed the temporal coupling problem. And replaced it with a configuration problem, as whenever I use the component I need to be able to get the URL and pass it to the component. Hmmm… is there a way to make this simple for the client?

I feel like this is a better solution. If I use the second option above

 client = Component.new('www.url.org')
 the_things = client.fetch_all_the_things

I have a creation problem – how can I make the creation of the client object simple so that my callers don't need to worry about URLs?

What about a Factory?

What about a factory class or method that returns Component.new('www.url.org')? All the creational logic to select the correct URL for production or staging can be handled there. It could also cache the created component in a singleton if needed.

Then our caller becomes

 client = ComponentFactory.GetComponent
 the_things = client.fetch_all_the_things

This is still 2 lines of code. But it is 2 lines of code that express the intent pretty well and don't require the caller to concern itself with the URL at all. And the factory that I would need to write for my application would also be very simple. It would need to know how to find the correct URL and then create the component object with that URL. I am pretty happy with this. It feels simple for the callers. I can keep the URL configuration in the application. And the component can simply talk to the URL it is given, keeping it simple.
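
A Ruby-flavoured sketch of that factory – the configuration source (Rails configuration here), the memoisation and the snake_case method name are all illustrative assumptions:

class ComponentFactory
    def self.get_component
        # All knowledge of which URL to use, per environment, lives here.
        @component ||= Component.new(Rails.application.config.component_url)
    end
end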

Could we do better?
Could we make it one line? We could maybe use the proxy pattern and simply hide all the interactions completely. But then we may be doing all the work that the component was supposed to be doing for us in the application as well, since we'd need a proxy for each wrapped API call. That feels like I'd create even more application code in order to make it simpler for the application to call the API. But by writing that application code I've increased the complexity of the application.

Conclusion
This isn't the only implementation, but based on the requirements it feels like a reasonable one. If, for instance, there could only ever be one endpoint that we talked to for a third party connection, then maybe encapsulating that in the component could be a good idea. But then testability might continue to drive the design towards a factory inside the component instead of a URL simply embedded in the code somewhere next to the API call that uses it.

I feel like this is a reasonable flow of thoughts leading to a simple, thought out solution. Hopefully others find it useful. And if you have other values / principles that come to bear that would have led to another solution – let me know – I'd love to hear them.

Bitten by Simple

I’ve recently done a piece of work around OAuth and integration with DotNetOpenAuth.  I wanted to build up the system doing the simplest thing at all times with the hopes of achieving the leanest, meanest code that I could for the implementation.  This worked well overall.

But… when building up a certain repository call to get a user – it was only being used by one controller (and still is) so I did the simplest thing and only built what that controller needed.  Life was good.

Then I changed my mind.  And I assumed that because my tests passed my code was all working.  So I started using another user field that wasn't being populated, as it needed an additional DB call and wasn't needed initially.  And I got an unexpected bug.

But I do TDD.  My tests passed.  All should be awesome.  I have the green light.  I can deploy.

Wrong. Hmmm…

Doing the simplest thing, I violated the principle of least surprise. I was surprised not to get a fully formed user from the user repository.  That was wrong.  Yes, the tests very clearly show me not testing the specific field, and adding the test showed it wasn't being set.  But the repository should always be returning a fully formed user object. Yes, I should have remembered, but even with a pretty good memory I still assumed it was all working.

So what should I have done?

a) Not violated the principle of least surprise!!
b) Actually checked what the tests did – they are the specification
c) Tested higher up at the controller level where possible
d) Introduced integration tests

The problem with integrating with DotNetOpenAuth is that the controllers call it and then it calls us.  And there is very little code in the controller that we control in order to make it all happen.  But there is some.  In this case the combination of the login page and the resource fetch (after the OAuth interaction) broke.  But setting up a controller with the correct stuff on it to give DotNetOpenAuth what it needs would involve me going very deep into understanding DotNetOpenAuth.  I don't actually want to test DotNetOpenAuth.  I do want to test my code around it.  As a result the controllers are being kept as simple as possible, but there is no automated testing around them.  And now I've been provably bitten for not having those tests.

And maybe I should actually want to test the interaction with DotNetOpenAuth – at least at the basic level.  An automated simple happy flow integration test would be useful here.  I’ve already got a manual test, but of course I got lazy.

So in future I plan to:
– Not violate the principle of least surprise… it isn’t worth it even if it is less code / simpler.
– Dig deeper into the synergy between unit tests – testing the code I'm writing – and integration tests – testing the full stack integration, particularly the parts that are harder to validate otherwise.

And of course, continue to write the simplest code I can and retrospect when I get bitten next.

A simple failure

Do the simplest thing.
I’ve also heard: Do the simplest thing that you can refactor later.

In TDD, you attempt to do the simplest thing to allow the real design emerge and to not add unnecessary code that is not needed.  This tends to reduce the complexity of the code.  There is plenty of writing about this – such as http://www.extremeprogramming.org/rules/simple.html and http://martinfowler.com/articles/designDead.html#TheValueOfSimplicity.

More than a year ago I did a new piece of work.  The team completed it and I went back to look at the code and saw that I could simplify it in several ways.  One of those ways was to remove a parameter that was effectively unused, as it was always the same id for all the objects in the list returned.  We did the work and life was good.  I felt that we were achieving simple and changeable code.  I felt good.

A month or so later we needed to change the way we fetched these objects so that the single id actually varied.  A developer attempted the task for two days before we gave up, as we had a simpler solution that the client was happy enough with.

I felt like the whole point of developing with tests and being safe was to make the code easier to change and here I failed.  Miserably.

So I retrospected…
Perhaps the removed complexity was important all along – as very rapidly it was needed.
Perhaps the code base isn’t as intuitive as it could be.
Perhaps the dev wasn’t good enough to see how to change the code easily.

All of these may be true.  In particular I observed two things:

  1. Misinterpreting the need: The usage that we could see from the UI was simple – needing only the objects for one id at a time.  The common usage of related lists was often with varying ids.  I had assumed that was complexity added unnecessarily, but in retrospect it actually was the needed requirement when analysing what the callers were really doing at a higher level.  The problem is that the callers aren't doing that – the code is highly convoluted and hides it – but they should be.  Since then I have also changed the way I look at building up things like this – in order to make things simple – and that would also have led me to the correct need of loading with varying ids initially.
  2. Expectation to change: I knew that the code was likely to change in this direction and I actively removed it in pursuit of simpler code.  Perhaps looking ahead a minuscule bit is valuable to support the open/closed principle in the short term.  If I had left the complexity in, it would already have been able to deal with this change.   I'm not 100% comfortable with this thinking.  In Agile Principles, Patterns and Practices in C#, Uncle Bob suggests only making such a change when the code changes in that way.  Guided by that principle, it shouldn't have been there.  But in this case it was there already (it wasn't test driven initially… this part was inherited).  I do suspect misinterpreting the need is more the root cause, though the tension this introduces is worth pondering and experimenting with.

So maybe it is useful to think of: Do the simplest thing that you can refactor later while keeping in mind how the current design is already telling you it needs to be open for extension?

But more likely, simple is hard, and you need to dig deeper to ensure you have the depth of the real need the code is fulfilling.