Discoverability – a design choice

I recently blogged about discoverability as a naming choice, and how the choice of name may make changing it later easier or harder.  What would happen if we started using the dynamic qualities of Ruby to do some meta programming – how would that influence a future developer's ability to discover how the code works?  How does it break the "Find in Files" model discussed in that post?

Let’s start by making it worse

Imagine an Active Record model, Tour, with a column full_price_for_tour in the DB.  In a vanilla Rails project, searching for full_price_for_tour in the codebase may result in no hits at all.  Similarly, when looking at a caller that invokes full_price_for_tour on an instance of Tour, we will not find any reference to full_price_for_tour in the class file for Tour.  For new Rails developers this can be very confusing.
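
A minimal sketch of the situation – assuming a tours table with a full_price_for_tour column:

  # app/models/tour.rb – note that full_price_for_tour appears nowhere in this file
  class Tour < ActiveRecord::Base
  end

  tour = Tour.first
  tour.full_price_for_tour  # works – Active Record defines it from the tours table at runtime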

The programming model that the Active Record implementation gives us is potentially a useful one – dynamically determining the methods from the database and creating them on the object.  But it harms discovery of how the code works.

So how do we help developers discover where the code is?

In a Rails codebase the annotate gem comes to the rescue.  It annotates the model classes with comments based on what is in the DB for the matching table.  This allows a developer to discover the list of dynamically created methods they can call on the model object – and hence what data the model object exposes.  This is a Good Thing.

Searching for full_price_for_tour will now have a match in the Tour class file – as a comment in the annotations.  The developer knows the method is backed by a column in the DB, because the annotations allow that discovery.
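
For instance, the annotation block annotate adds looks roughly like this (a sketch – the exact columns and types depend on your schema):

  # == Schema Information
  #
  # Table name: tours
  #
  #  id                  :integer          not null, primary key
  #  full_price_for_tour :decimal(8, 2)
  #
  class Tour < ActiveRecord::Base
  end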

And then someone gets clever

The Active Record implementation leverages the dynamic qualities of Ruby to do something useful for developers.  But not all dynamic meta programming is beneficial.  There are always trade-offs in software design.

Some production Ruby code I saw recently implemented something along the lines of:

  def method_missing(method_sym, *arguments, &block)
    @hash_obj[method_sym]
  end

This code was written to provide access to a hash's properties via method calls – a convenience.  There were also some other helper methods defined in the class to work out standard answers to questions the hash could provide.  On the face of it, this looks like a clever use of Ruby as a dynamic language.

But how does another developer discover all the methods that this class responds to?  We need to find the hash definition to discover that.  In this case, the hash came from some JSON in an HTTP POST.  The simple question of what we can expect this object to answer to was not codified in the code at all, leaving all developers on the team unsure what the correct method names were.
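
To make that concrete, a sketch – HashWrapper and the status key are made-up names standing in for the real class and payload:

  response = HashWrapper.new(json_from_post)
  response.status                  # works, via method_missing
  response.respond_to?(:status)    # => false – the object denies it
  response.methods.grep(/status/)  # => [] – nothing for a search or tooling to find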

Going back to the refactoring example: assuming this was the Tour class and we were calling full_price_for_tour on an instance, how would we find the implementation?  First we'd fail to discover one with a "Find in Files" type search.  Then we would have to spend time working out why there wasn't one, and what the magic was that made it work.  As a developer this is time wasted.  It's even worse when the question is "what is the full interface?"

Another clever thing

Some Ruby code I can easily imagine is:

  def method_missing(method_sym, *arguments, &block)
    method_as_string = method_sym.to_s
    split_methods = method_as_string.split('_')
    obj_to_call = split_methods.shift
    method_to_call = split_methods.join('_')
    obj = public_send(obj_to_call)
    obj.public_send(method_to_call)
  end

Assuming this code is in the Tour class, we can now call fullprice_for_tour* on a Tour instance.  method_missing will then fetch the fullprice object inside the instance and call the for_tour method on it.

tour.fullprice_for_tour would be the same as tour.fullprice.for_tour.

* I’ve changed the method name to fullprice in order to make the code example simpler.

This kind of code is clever.  But it stymies discoverability again.  When I search for the method fullprice_for_tour I will be unable to find any definition of it anywhere.  I now need to investigate the Tour class file to determine that there is a method_missing handler, and work out that we are actually calling fullprice on the instance and for_tour on the object it returns.  Now I can find the code.

The simple model of searching for the implementation is broken by this coding style.  Searching becomes an iterative process when nothing comes up – which takes longer.

And then there are Rails delegates

In Rails you can add to the Tour class

  delegate :for_tour, to: :full_price

which enables

  tour.for_tour

to be the same as tour.full_price.for_tour

You can even add prefixes

  delegate :for_tour, to: :full_price, prefix: :full_price

which now enables

  tour.full_price_for_tour

to call tour.full_price.for_tour

This saves a developer from writing

  def full_price_for_tour
    full_price.for_tour
  end

in the Tour class.
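
Rails also accepts prefix: true, which derives the prefix from the name of the delegation target, so this would define the same full_price_for_tour method:

  delegate :for_tour, to: :full_price, prefix: true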

We save writing a method definition.  But discoverability is hurt – particularly when the prefix is used.  We now have to do multiple different types of searches to discover where full_price_for_tour is defined.  And we need to remember to do that.  And as we have seen, there could be multiple different ways in which the method could be defined dynamically.

A Hypothesis

The cost of discovery should be at least N times lower than the cost to write the code, where N is the total number of times the code will be viewed and understood before it is deleted.  For example, if a method will be read twenty times before it is deleted, each discovery must cost less than a twentieth of what it cost to write.

I would hypothesise that saving the writing of a trivial method definition at least doubles the time it takes to discover that method.  In general my first guess will be wrong.  I have to guess at least once more – looking for delegates.  But then again, it might be defined in yet another dynamic way, so I might need to keep on guessing.

The design choice of coding this way results in a codebase that, on average, takes longer to discover things in – which means that over time software will take longer and longer to deliver, compared to the constant cost of typing a little more at writing time.

The constant cost of typing the method definition occurs once – when the developer writes it.  The cost of discovery occurs every time a developer needs to understand where the code is defined or from where it is called.

So is dynamic meta programming ever justified?

For the majority of developers, the core type of work is business applications that mostly do CRUD operations.  Use cases are driven by actual requirements.  Actual requirements are concretely defined.  They should have concrete tests that define them.  Using dynamic meta programming is almost never required.

Sometimes the code is doing the same thing behind those concrete interfaces.  The code may want to be refactored to take advantage of dynamic techniques to reduce duplication and expose a new level of abstraction.  This can be valuable when things are in fact the same.  If the abstraction makes the system easier to change, this is good.  But these changes should be done beneath the concrete definitions of what the system does.  The system is a set of concrete interfaces and use cases that have concrete tests.  That is what allows us to refactor to a more abstract design below the covers.  As the underlying code becomes more abstract, the external interface and the tests calling the interface remain specific.  The abstraction should not be the exposed interface of your average business application.  The abstraction should not make it harder to discover how the system works.
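
As a sketch of what "beneath the concrete definitions" could look like – discount is a made-up second component alongside full_price:

  class Tour < ActiveRecord::Base
    # Concrete, searchable interface – callers and tests use these names.
    def full_price_for_tour
      component_for_tour(:full_price)
    end

    def discount_for_tour
      component_for_tour(:discount)
    end

    private

    # The shared, more abstract bit lives below the concrete interface.
    def component_for_tour(name)
      public_send(name).for_tour
    end
  end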

Observations

Many developers value locally optimising the time they save writing code.  At the same time, they ignore the time they cause someone else to waste when working out the implementation at a later date.  Most code is written once and read many times.  Optimising for discoverability and understanding is more useful on your average business application than optimising for the speed at which you can take on the next story.  Optimising for speed to the next story now will slow you down later, when time must be spent discovering how to change the code in order to implement the new story.

I value discoverability.  Having worked on many large codebases, I know that finding things needs to be as easy as possible.  I understand others may value terseness more.  Design is always a trade-off.  Understanding what is being traded off is important.  I don't consider using meta programming to reduce the lines of code I need to write more important than being able to discover and understand that code quickly and reliably later.

If your team uses code like the Rails delegate everywhere, then everyone already knows that searches to discover a method's usage or implementation should take that into account.  Everyone will be doing it, and perhaps that is fine – despite the increased complexity of the search.  What matters here is consistency and an element of least surprise.

If a codebase sometimes uses magic – method_missing, delegates, etc – and sometimes does not, then it becomes more of a guessing game when to search for them and when not to.  That is a real cost to maintaining a codebase.

If I haven’t found the code in my search – is that because it isn’t there or is it because it is using some other magical way in order to be declared?

Don’t use dynamic meta programming unless it is really useful.  Most things are cleaner, clearer and easier to change without meta programming.

If you’re breaking the paradigm, use something else to mitigate the loss.  In the case of Active Record, using the annotate gem to help discoverability mitigates the dynamic implementation that makes discovery of the methods harder.
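
In the same spirit, if method_missing really is the tool for the job, pairing it with respond_to_missing? at least lets respond_to? and some tooling see the dynamic methods.  A minimal sketch of the earlier hash wrapper, assuming symbol keys:

  def method_missing(method_sym, *arguments, &block)
    @hash_obj.key?(method_sym) ? @hash_obj[method_sym] : super
  end

  def respond_to_missing?(method_sym, include_private = false)
    @hash_obj.key?(method_sym) || super
  end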

Think!

Think about discoverability.  Think about the cost of making discoverability harder.  Is there something that can be done to mitigate the cost of this design choice?

All design choices are choices.  Weigh up the pros and cons for yourself and with your team.  Discoverability is just one facet of good code design, but all too often it isn't even a factor in the thought process.


Discoverability – A naming choice

When reading or refactoring code, it is important to be able to easily find all callers of a method or to go to the implementation of a method.  In static languages, IDEs make this reasonably easy.  However, if reflection is being used, they may still fail us.  It is simply easier to parse a static language to know the types involved and find the right ones to tell us about.  In dynamic languages that is much harder.

In a dynamic language, a key way to find all references to an object’s method call is “Find in Files”.  This means what we choose to name things may make it harder or easier to change later.  It may also make it harder to discover who is calling the method – or even that the method exists.

A unique name

In order to refactor a uniquely named method on a class

  • search for the method name as a string
  • rename

As we know it is unique, this will work.  In fact, you might be able to run a simple find and replace in files instead of looking at each result individually.

This scenario, however, is unlikely.  At least, it is unlikely that we emphatically know that a given method name is unique.

A more likely scenario

In order to refactor a descriptively named method such as full_price_for_tour on a Tour class

  • search for the method name as a string
  • in each search result – check all references to the method name to see if they are in fact using a Tour object
  • if this is a Tour object call, rename the method call to the new name.

This is more work as we need to look at each result.  Hopefully with a descriptively named method the number of usages will not be too high.  Even if the number of usages is high, hopefully all usages of the name will in fact be on the Tour class.

However, we do need to look at each result, as this process is potentially error prone.  There could be other method definitions using the same name that we must NOT rename.  Hopefully there are tests that will tell us if we get it wrong.  And hopefully, thanks to the method's descriptive name, the number of callers to change isn't too high and the changes to the callers are clear.

Sometimes the results are less simple

Now imagine repeating the above exercise, but where the method to refactor is called name.  Suddenly we may have a huge number of hits, with many classes exposing a name method for their instances.  The ratio of search hits that actually need updating is no longer close to 100%.  The probability of error is much higher – the greater the number of hits, the more decisions need to be made.

An IDE may help

Immediately the IDE lovers will point out that using an IDE is the solution.  And yes, it could help.  But IDEs for dynamic languages are generally slow and CPU/memory intensive, as the problem is a hard one to solve.  And they won't always be correct.  So you will still need to employ strategies using a human mind.

Naming things more explicitly can help

A more useful model – even if you’re using an IDE – is to name things descriptively, without being silly.  Things like tour_name and operator_name instead of name may help someone discover where / how a method is being used more easily.

Designing code to only expose a given interface can help

Building cohesive units of code that only interact through a well-defined interface makes changing things behind the interface a lot easier.  However, it still doesn't stop developers from reaching in behind the curtain and using internals that they should not.  So you will still need to check.  Hopefully code that breaks the design like this will be caught before it gets merged into the mainline, but you never truly know without looking.

Reducing scope where possible can help

Knowing the scope of access of the thing you need to change can make changing it easier, as it reduces the area you need to look in.  For example, if something is a private method then we know that as long as all usages in this class are updated, we are completely free to change it.  Unless someone has used send to call the private method from somewhere else…  Or we are mixing in a module that uses the private method…  Both of which I'd like to think no one would be silly enough to do.
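
For the curious, the "reaching in" looks like this – a sketch with a made-up private method:

  class Tour
    private

    def internal_price
      42
    end
  end

  Tour.new.internal_price         # NoMethodError – private method
  Tour.new.send(:internal_price)  # => 42 – send bypasses privacy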

Testing can help

Obviously, having a comprehensive test suite that calls all implemented permutations of the method's usage will help to validate a change.  It will hopefully tell us when we have missed updating a caller, or when we've accidentally changed a caller that shouldn't have been changed.  However, if there are name clashes with the new name, it is plausible that the suite won't give the feedback we expect – so it isn't a silver bullet if you aren't naming things well.

Think! 

Think about naming.  Think about discoverability.  Is there something that will make changing this easier in the future?

Think about the cost of making discoverability harder.  Be aware of the implications of a naming choice.  Is there something that can be done to make it easier to safely refactor away from this choice later?

Can we make things worse?  Discoverability is a design choice – one we can make easier or harder.

Discovery – an HTML and JavaScript example

I value ease of understanding in code. I find it helps me to develop more maintainable software that I am confident to change. One of the things I attempt to optimise is how easy it is to discover how the code works. I ran into an example of this in a code review recently that I thought was worth sharing. Below is a modified example of what I was reviewing.

The example

Assume the following HTML.

  
  <textarea name='area1' id='area1' class='area-class'></textarea>
  <textarea name='area2' id='area2' class='area-class'></textarea>
  <textarea name='area3' id='area3' class='area-class'></textarea>
  

Along with the following JavaScript in a .js file that is loaded with the HTML page.

  
  function update_counter() {
    // some code to update the element
  }

  $("#area1").change( update_counter );
  $("#area2").change( update_counter );
  $("#area3").change( update_counter );
  

How does the next developer know that there is a counter being hooked up? How does the next developer know how it is being hooked up? How easily could a developer look at the HTML and successfully add a new textarea with the same capabilities?

Based on the above HTML alone, there is no clue that JavaScript hooks into it. There may be a JavaScript include on the HTML page that would lead us to the file doing the work, but that isn't something that will be looked at unless there is a reason to look – and the HTML isn't giving one.

Knowing which file to look at could be even less obvious if you’re using something like the Rails asset pipeline that precompiles and bundles files together as there is unlikely to be a single include for the .js file.

If we knew we were looking for some JavaScript, another discoverability mechanism would be to search for any instance of the textarea’s id or class being used in JavaScript. This could be a little painful.

A simple fix

In this case, a simple fix is to move the initialiser that hooks update_counter to the specific DOM elements onto the HTML page itself. This highlights to a developer which DOM elements are being bound to JavaScript, in the same file as the elements are defined. This provides a breadcrumb for the developer to follow in order to discover how things are hooked up.
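
A sketch of that simple fix, assuming jQuery and the update_counter function from the .js file:

  <textarea name='area1' id='area1' class='area-class'></textarea>

  <script>
    // The binding now lives on the same page as the element it binds to.
    $('#area1').change( update_counter );
  </script>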

When I have used Knockout.js in the past, I have had the binding action run on the HTML page, binding the HTML to the relevant JavaScript model, to help a developer discover what JavaScript code to look for.

What if it isn’t that simple

What if the code that is being run to do the binding is a lot more complex?

Another way to do this could be with data- attributes.

  
  <textarea data-counter name='area1' id='area1' class='area-class'></textarea>
  <textarea data-counter name='area2' id='area2' class='area-class'></textarea>
  <textarea data-counter name='area3' id='area3' class='area-class'></textarea>
  

and the JavaScript binding becomes

  
  $("[data-counter]").change( update_counter );
  

This code now allows the developer to look at the mark up and ask the question “What does the data-counter attribute do?” This will lead them to the fact that there is JavaScript binding to the element.

Even if the developer takes no notice of the data-counter attribute, copying a row and updating the id and name to unique values will still work the same as all the other textareas, without the developer needing to think.

The term “Falling into the Pit of Success” is sometimes used to represent things that developers will do by default that will in fact be the correct decision. This is an example of that.

The additional benefit is that the data-counter could be used on multiple pages across the site and it will work the same.

The actual instance

In this case the counter is visually obvious, so there are some clues prompting the developer to look for the JavaScript counter and how it ties in. However, the actual instance I was code reviewing was saving something to local storage per textarea, and then restoring it, in order to implement a feature. That was even more opaque, as it was less obvious that it was happening – or why.

Value discoverability

Always think about how the next person will discover how the code you are writing now fits together. It might even be you in six months' time. How quickly can it be worked out? Is it explicit, obvious, and easy to do the right thing? How will they discover what you did?

The faster it is to discover the way the code works, the less time will be wasted trying to work out how it works, and the more effective you and your team can be.

Personal Responsibility

Last week I was able to attend Agile Africa 2015. The most inspiring talk, amongst the many great talks and conversations, was Kent Beck’s closing keynote. He talked about the engineering culture at Facebook and its key overriding value of taking personal responsibility.

At Facebook as Kent Beck described it
The task you pick up is your responsibility to get right. All parts of it. Even when it is the first task you do when joining. No one other than the developer is responsible for the decisions required to get it done. If you deploy a bug, someone else may see it and will fix it. And they will give you feedback. Everyone is responsible.

When joining Facebook, developers join Facebook Engineering. Developers are not hired into a specific team at Facebook. In the first weeks the new hire chooses which team they will join. They are personally responsible for the choice to ensure that they join the right team and make the most impact in the organisation.

This is a culture that trusts you to know what you should be doing. If you get it wrong, you will receive feedback.

A culture
A culture of high focus on impact guides developers to prioritise their time effectively.

A culture of clear, direct and immediate feedback makes sure that there is a feedback loop on responsibility.

A culture of P50 goals drives improvement and striving for new thinking. P50 goals are goals that you aim to hit with 50% probability. Spread them around so that all the 50% probability isn’t in one area – otherwise no goals may be achieved at all! [1]

This is a culture of high responsibility, but associated with that must be very high trust. And it has scaled to thousands of developers.

Not Agile
Facebook does not claim to be Agile. There are pockets where there are lots of tests. There are even larger pockets where there are none.

What? The inventor of XP works at a non-Agile organisation? What does that mean? Is Agile dead?

Very agile
However Kent Beck has seen FB turn on a dime. FB is highly successful. Engineers enjoy working there – at least Kent seems to enjoy it immensely. It sounds like a very agile organisation.

Personal Responsibility
The key is leveraging personal responsibility. It is what Kent says he was getting at when he wrote the XP book.

High personal responsibility leads to developers needing coping mechanisms so that they don’t mess up. If you mess up, you own the mess – 100%. If you mess up too much, you won’t last long. If someone sees your mess, they will fix it – because they too are responsible – and you will get clear, direct and immediate feedback to act on for the future.

Personal Responsibility drives behaviours. It may lead you to TDD. It may lead you to lots of acceptance testing. It may lead you to implementing lots of metrics, or monitoring a lot and responding as soon as errors are detected. All of these help a developer keep a handle on their work, and hence help them to be responsible. There are no 'the system is at fault' type answers to failure.

Personal responsibility leads to developers having to take control of the code base they are working on. It leads to developers being more confident of their work. If they are not, they fail. They will receive clear, direct and immediate feedback. And they must take responsibility for the problem.

XP has been pointing at personal responsibility all along. And it seems that personal responsibility can point straight back at XP – as one of the possible outcomes of dialling responsibility up to the extreme.

Personally responsible
Personal responsibility touches at the heart of much of great software development. It feels like it should be a core value to optimise around. I certainly am going to be thinking harder on what it means in everything I do.

If you weren’t there – watch out for the videos when they are out. It was a really inspiring (and funny) talk.

If you have seen the talk, let me know what you’re thinking as it would be interesting to hear what others will do around this.

 

[1] More on P50 goals in Growing Agile’s latest newsletter at http://eepurl.com/bxmJFb.