I was in a meeting the other day deciding what to do next in our
testing efforts. Several times during the meeting someone made a
suggestion which was countered by a statement something like this:
"Shouldn't we let the centralized team handle this?" or "Could we write
this in a more generic fashion so everyone can benefit from it?" In
general these are good questions to ask. Every time, though, something inside me reacted negatively. I didn't want to go the centralized route. This post is an examination of why I felt that way and why, perhaps, it is not always a good idea to do things centrally; why it might instead be a good idea to duplicate effort.
The sorts of issues we were talking about revolve around test harnesses and test systems.
For instance, we might want some new way of generating reports from our
shared test result database or some new way of monitoring performance
issues. Each of these can be solved in a specific way for a problem
area or can be solved more generally and serve many problem areas.
Doesn't it always make sense to solve it more generally?
It sure seems so on its face. After all, if Team A and Team B both have similar problems and both put N units of effort into a solution, we've spent 2N units of manpower when we might have spent only N and used the solution twice. That other N units of manpower could be better spent working on a solution to a different problem. Is it really that simple?
As programmers we are trained from early on not to rewrite code. If we need to do something twice, we are trained to write it once and then call it from both places. When it comes to code, this is good advice. However, when it comes to tools, the same rules may not apply. Why not? Because when it comes to tools, the jobs you are trying to do are, more often than not, non-identical. Even if 80% of the jobs are the same, that last 20% of each makes a big difference.
Writing one tool to do the same job in two places is always better than writing two different tools. But what if the jobs are dissimilar? In that case, the answer is not necessarily clear. Instead we need to weigh the issue carefully before making a decision. The advantage of doing things once is obvious: it saves us from having to duplicate effort. There are, however, disadvantages that often overwhelm that advantage. That is, the cost of doing things once can be higher than the cost of doing them twice. Why might that be? There are three things which cut against the time savings of a single solution. First is the increased complexity of the task. Second is the decreased quality of the mapping between the problem and the solution. Third is the lack of responsiveness that often accompanies a shared project.
Solving two identical problems with one solution is easy. It takes N time. However, solving two similar but not identical problems takes more than N time. That's easy enough to accept, but it should still take less than 2N time, right? Assume two problems are 80% similar. That means they are 20% different. It might seem that it would take 1.2N time to solve both problems. Unfortunately, that isn't the case. Making one framework do two things is harder than just doing two things independently. I would argue it can be up to M^2 as hard (where M is that 20% delta). Thus, measured in one-percent units of work, solving the two problems independently takes 200 units, but a combined solution might take 480: roughly 80 units for the shared portion plus 20^2 = 400 units for the interacting differences. Why this polynomial explosion in time? Because trying to get one interface to do one thing is easy but trying to get it to do two is much harder. You have to take into account all of the tradeoffs, mismatches, and variations required
explosion in time? Because trying to get one interface to do one thing
is easy but trying to get it to do two is much harder. You have to take
into account all of the tradeoffs, mismatches, and variations required
to handle each situation. If you have M functions (or variables) and
each needs to take into account M other functions, you have M*M or M^2 possibilities to deal with. For instance, if we are calculating an FFT
to compare audio samples and one problem expects the values to be in
float32 and the other in int32, trying to make one interface that
handles both situations is very hard. The same thing happens when you
try to create a test system that handles everyone's problems. It
becomes inordinately complex. Writing it becomes a painstaking process
and takes longer than solving the problem twice.
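To make that concrete, here is a minimal sketch in Python with NumPy (my choice of language; the post names none, and every function name here is hypothetical). Two format-specific comparison routines stay simple, while a unified entry point immediately has to reason about every pairing of input formats:

    import numpy as np

    # Specialized tools: each assumes one input format and stays simple.
    def compare_float32(a: np.ndarray, b: np.ndarray) -> float:
        """Compare two float32 audio buffers by FFT magnitude difference."""
        return float(np.abs(np.fft.rfft(a) - np.fft.rfft(b)).max())

    def compare_int32(a: np.ndarray, b: np.ndarray) -> float:
        """Compare two int32 audio buffers; normalize before the FFT."""
        scale = float(np.iinfo(np.int32).max)
        return compare_float32(a.astype(np.float32) / scale,
                               b.astype(np.float32) / scale)

    # Unified tool: one entry point now has to consider every pairing of
    # formats and normalization rules -- the M*M interactions above.
    def compare_any(a: np.ndarray, b: np.ndarray) -> float:
        kinds = (a.dtype.kind, b.dtype.kind)
        if kinds == ("f", "f"):
            return compare_float32(a, b)
        if kinds == ("i", "i"):
            return compare_int32(a, b)
        # Mixed formats: whose convention wins? Each new format added here
        # multiplies the cases that must be designed, documented, and tested.
        raise TypeError(f"unsupported dtype pairing: {a.dtype}, {b.dtype}")

The point of the sketch is not the FFT itself but the branching: every check in compare_any is complexity that neither specialized tool needed.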
There is a
saying "jack of all trades, master of none." This applies to software
just as it applies to professions. Just as the best carpenter is
probably not also the best plumber, the best solution to problem A is
probably not the best solution to problem B. There are always tradeoffs
to be made in trying to solve two problems. Something inevitably
becomes harder to solve. I have to do more initialization or I have to
transform my data into a universal format or I have to fill in fields
that make no sense for my cases. Most of these are small annoyances but
they add up quickly. In Windows we have a unified test system for
almost all of our testing. During the Vista timeframe, this was brought
online and everyone moved to it. Changing systems is always painful
but even after the cost of switching was paid, many (most?) people still
felt that the new solution was inferior to whatever they had been using
before. Was the new system that bad? Yes and no. It wasn't a
terrible system but the incremental cost of a less specialized tool
added up. Benji Smith has an essay
I think is applicable here. He is talking about web frameworks but
uses the concept of hammers. There are many different kinds of hammers
depending on whether you want to tap in a brad, pull a nail, pound a
roofing nail, or hammer a chisel. A unified hammer doesn't make sense
and neither does a framework for creating specialized hammers.
Sometimes having 3 hammers in your drawer is the right answer.
Finally,
there is the problem of responsiveness. Shared solutions are either
written primarily by one team and used by a second or they are written
by a specialized central team and used by everyone else. Both of these
have problems. If my team is writing a tool for my team's problem,
we're going to get it right. If we need certain functionality in the
tool, it will be there. However, if someone else is writing the tool,
we may not be so fortunate. One of two things is likely to happen.
Either the other team will have different priorities, so our feature will be written but on a schedule not to our liking, or, worse, our feature will be cut. It could be cut due to time pressures, or because it was decided that it conflicted with other scenarios and both couldn't be solved simultaneously. Either way, my team now has a big problem. We can work around the tool, which is always more expensive than building the feature in from the start, or we can forgo the feature, which means our functionality is limited.
When one adds up the explosion in
complexity, the lack of clean mapping between the problem space and the
solution, and the inevitable lack of responsiveness, it can be much more
expensive to go with the "cheaper" solution. Here is a rule of thumb:
if you set out to solve everyone's problem, you end up solving no one's
well.
Now that I think about it, these same principles apply to
code. If you are doing the exact same thing, factor it out. If you are
doing similar things, think about it carefully. Writing generic code
is much harder than writing specific code in two places. Much harder. A
while ago I read an essay (which I can't find now) about writing
frameworks. Basically it said that you shouldn't. Write a specific
solution. Then write another specific solution. Only when you have a
really good sense of the problem spaces can you even hope to produce a
decent framework. The same goes for testing systems or other general
tools. Specialization can be good.
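As a small illustration of that advice (a sketch under my own assumptions; the scenario and names are hypothetical, not from the post), compare two specific report generators with a prematurely generic one:

    # Two specific solutions: each is short, obvious, and easy to change.
    def perf_report(results: list[dict]) -> str:
        """Report performance numbers, and nothing else."""
        return "PERF\n" + "\n".join(f"{r['test']}: {r['ms']} ms" for r in results)

    def pass_fail_report(results: list[dict]) -> str:
        """Report failures, and nothing else."""
        failed = [r["test"] for r in results if not r["passed"]]
        return "FAILED: " + ", ".join(failed) if failed else "ALL PASSED"

    # One "generic" solution: before it can serve both callers it grows
    # knobs for fields, formatting, filtering, and headers -- and every
    # caller now pays the cost of options written for someone else.
    def generic_report(results, fields, row_format, row_filter=None, header=""):
        rows = [row_format.format(**{f: r.get(f, "") for f in fields})
                for r in results if row_filter is None or row_filter(r)]
        return header + "\n" + "\n".join(rows)

Only after both specific versions exist is it clear which knobs the generic version actually needs, which is exactly the framework essay's point.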
All too often we make a simplistic calculation that writing something twice is wasteful and don't bother to think about the extended costs of that decision. Many times writing one solution is still the correct answer, but many other times it is not. In our zeal to save effort, let us be careful we don't inadvertently increase it. The law of unintended consequences operates in software at least as well as it does in public policy.