Monday, January 17, 2011

Lean Development : Eliminate Waste

This post has taken longer than I'd hoped, and is consequently less polished than I'd hoped... but here goes...

The first entry in our list of Lean Development concepts shows Lean's focus on efficiency:
Eliminate Waste - No Extra Features, Churn or Boundaries
Before we can eliminate waste we must first understand what waste is. I think of this concept as "Don't do what you don't need to do" and waste is any resource use that is not driving your core business.

Here is a list of 7 deadly wastes - which I believe come from the original Toyota lean methodology.

  1. Transportation
  2. Inventory
  3. Motion
  4. Waiting
  5. Over-Processing
  6. Over-Production
  7. Defects and Rework
In most of these cases the resource that is being wasted is time. 
I'm going to cover the first 4 today, and more tomorrow.

Transporting a commodity from A to B takes time and may have a monetary cost. But in a digital world this is less applicable. However if you think of this as meaning inefficiencies in your supply chain this may make more sense. How long does it take you to get a "ready" version of the code out to your customers? How painful is this process? How robust?  What are you doing in the process that can be automated or removed? Any manual interaction that is not needed is wasted effort. Any unneeded delays are wasted time. 

Inventory in a software sense is completed features not shipped to the users. A feature is not providing value to a customer when it is sitting in your development version of the software. The sooner you can get a completed (and tested!) feature out to your users, the sooner it provides value to them, and thus drives value for you. Of course if every push to the users is taking a couple of days of effort then you have a transport issue that needs fixing first.


Motion, to me, means a developer (or task) performing unnecessary actions on the way through the conceptual pipe from idea to implementation. These often crop up as bureaucracy: paperwork that will be discarded, double entry into multiple bug tracking systems, emails to supervisors, stuff that makes a developer busy but not productive. This is wasted effort and time.

Waiting: this usually means waiting for feedback from another party, or waiting for compilation, or waiting for tests to run. The developer ends up doing nothing productive, or is less productive due to context switching between tasks. So parallelize or speed up the process. Tests should be near instantaneous: if your full test suite takes an hour to run, split it up across 10 machines and it should take about 6 minutes, assuming the tests divide evenly. Compilation should be fast (and incremental): use ccache, distcc or something like that. Feedback should be fast or at least predictable.
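
The 6 minute figure is the best case, where the tests divide evenly across the machines. As a rough sketch of the arithmetic (in C++, with made-up names and a simple longest-first assignment that is only one of many ways to shard a suite), the wall-clock time is set by the busiest machine:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Estimate the wall-clock time (in minutes) of a test suite spread over
// `machines` workers. Illustrative only: tests are handed out longest-first,
// each one going to whichever machine is least loaded at that point.
double parallelWallTime(std::vector<double> durations, std::size_t machines)
{
  if (machines == 0) machines = 1;  // treat "no machines" as a single machine

  std::sort(durations.begin(), durations.end(), std::greater<double>());
  std::vector<double> load(machines, 0.0);
  for (double d : durations)
  {
    // Give the next-longest test to the currently least busy machine.
    *std::min_element(load.begin(), load.end()) += d;
  }
  // The suite is finished when the busiest machine is finished.
  return *std::max_element(load.begin(), load.end());
}
```

Sixty one-minute tests on 10 machines come out at 6 minutes, but a single 30 minute test keeps the total at 30 minutes no matter how many machines you add, so slow individual tests are worth splitting up too.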

Well that's it for today. As I said, it's a bit choppy, but hopefully you got something out of it.

Sunday, January 16, 2011

Lean Development

I've been looking for something I could write a series of posts about, and I stumbled across Tom Poppendieck's business card. An odd thing to blog about you might think, but on the back of the card are 7 core ideas of Lean software development. I'm just going to repeat them here, then elaborate on what they mean to me. Lean is a style of development that I've been trying to move towards. It focuses heavily on efficiency, blending ideas from Agile Development with ideas learned from large-scale manufacturing in Japan. An odd combination, but it feels right to me.

So here's the list verbatim from Tom's card:

1. Eliminate Waste - No Extra Features, Churn or Boundaries
2. Build Quality In - Mistake-Proof with TDD; Write No New Legacy; Continuous Integration
3. Focus on Learning - Scientific Method; Challenge Standards; Feedback; Continuous Improvement
4. Defer Commitment - Break Dependencies; Maintain Options; Irreversible Decisions at Last Responsible Moment
5. Deliver Fast - Low Cost and Quality and Speed; Queuing Theory: Flow, Limit Work to Capacity
6. Respect People - Pride, Commitment, Trust and Applause; Effective Leadership; Respect Partners
7. Optimize the Whole - Measure Up, Avoid Sub-Optimization; Whole Value Stream & Whole Product

Oh and just a note, any mistakes here are purely my own, as are any opinions presented. But credit for the ideas really belongs elsewhere.

Friday, December 3, 2010

Bash named pipes

Pipes are a Unix staple. They're core to the Unix philosophy of modularity. Bash's named pipes just knock this up a notch.

Let's say we have two directories that should contain the same files, but one is on a case-insensitive file system, so some file names might differ in case. We could do this, using normal pipes to process the directory entries.

ls dir1 | tr "[:lower:]" "[:upper:]" | sort > dir1_contents
ls dir2 | tr "[:lower:]" "[:upper:]" | sort > dir2_contents
diff dir1_contents dir2_contents

But this is kind of hacky and creates two temporary files.

The <( ... ) syntax (the Bash manual calls it process substitution, and Bash implements it with named pipes or /dev/fd entries) lets you use the output of a pipeline in place of a filename. It's almost as if the pipeline's output were written to a temporary file that is deleted afterwards, but no disk space is consumed.

diff <( ls dir1 | tr "[:lower:]" "[:upper:]" | sort ) <( ls dir2 | tr "[:lower:]" "[:upper:]" | sort )

Much neater.

Mind you I haven't used these often, but if you know they exist they can make some painful multistep processes much simpler.


Monday, September 27, 2010

Abusing shared_ptr custom deleters to store data.

So you can associate a custom deleter with a boost::shared_ptr object. That deleter will get called when the last reference to the underlying object is destroyed and is responsible for cleaning up the object. You might use this to use a boost::shared_ptr to hold a mysql handle.
boost::shared_ptr<MYSQL> mysql( mysql_init(NULL), mysql_close );

(or something like that .. can't remember the exact function names)
But you can also "hide" objects with the shared_ptr that will get destroyed with it... I'm not sure quite why you might want to do it, but someone was asking how to do something similar on Stack Overflow.

Here's how you do it.
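
A minimal sketch of the idea, using std::shared_ptr rather than boost::shared_ptr so that it compiles without Boost (boost::get_deleter plays the same role there). DeleterWithPayload, makeWithPayload and hiddenPayload are names invented for this example:

```cpp
#include <memory>
#include <string>

// Hypothetical deleter that carries extra data alongside the pointer.
// The shared_ptr stores a copy of the deleter, so the payload is destroyed
// together with the last reference to the managed object.
struct DeleterWithPayload
{
  std::string payload;  // the "hidden" object

  void operator()(int* p) const { delete p; }
};

// Create a shared_ptr<int> that silently carries `payload` with it.
std::shared_ptr<int> makeWithPayload(int value, const std::string& payload)
{
  return std::shared_ptr<int>(new int(value), DeleterWithPayload{payload});
}

// Returns the hidden payload, or nullptr if the pointer was created with
// some other deleter (or none at all).
const std::string* hiddenPayload(const std::shared_ptr<int>& p)
{
  DeleterWithPayload* d = std::get_deleter<DeleterWithPayload>(p);
  return d ? &d->payload : nullptr;
}
```

Callers only ever see a std::shared_ptr<int>. The payload is reachable through get_deleter, but only by code that knows the exact deleter type, which is part of why this feels like abuse.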


Friday, September 24, 2010

Graphing include dependencies

Sometimes it's nice to get a higher level view of what files are including which other files.
This becomes tricky as your projects get larger, so I wrote a little Python script to help me visualise it.
You can find the script here

It takes a list of files to analyse on stdin and produces a dot graph file that you can render using the graphviz tools (I find dot and neato to be the most useful for this task)

Here's an example of what I get when I run it over a filtered subset of one of my projects:
(Original is a PDF so it's easy to zoom in to see filenames, but you only get a PNG to protect the innocent)


Thursday, September 9, 2010

Fixing whitespace issues with git rebase

We're using a rebase and merge style approach for getting topic branches into our master branch.

Typical usage looks like this:
# Get the topic branch
git checkout topicbranch
# Sit it on top of master
git rebase master
# Clean up all the commits so it's nice and tidy
git rebase -i master
git checkout master
git merge topicbranch
Now if you add a --whitespace=fix to the first rebase stage, git will clean up all the odd whitespace at end of line stuff that git complains about and that editors can leave in there accidentally.

However this only works if the rebase actually has something to do. If your branch is already based on the current master head you'll just get the message:
Current branch topicbranch is up to date.

It turns out it's easy to make the rebase happen:
git rebase -f master --whitespace=fix

Friday, June 25, 2010

Smashing the Stack: Storing per class information

This Stack Overflow question demonstrated some issues with storing per-class metadata in templated info classes. In particular it boils down to something like this:

Original Code


template<typename T>
struct MetaData
{
  static std::string data;
};
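With the requirement that you add something like this to a source file for every class you require metadata for. Failing to do this will generate a link time error.
// template<> plus an initializer makes each line a definition, not just a declaration
template<> std::string MetaData<int>::data = "";
template<> std::string MetaData<MyClass>::data = "";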
The original code using this kind of method would look something like this:
template <class T>
static void OpenLib(lua_State* L)
{
  if (MetaData<T>::myTable && MetaData<T>::myTableName)
  {
    luaL_openlib(L, MetaData<T>::myTableName, MetaData<T>::myTable, 0);
  }
}

Template specialisation

Personally I don't like link time errors and prefer compile time errors, so we could try using template specialisation to get a similar result.
template<typename T>
struct MetaData
{
};
template<>
struct MetaData<int>
{
  static std::string data() { return "foo"; }
};
template<>
struct MetaData<MyClass>
{
  static std::string data() { return "bar"; }
};
This time you'll get a compile time failure if you fail to define the MetaData for a class that you are using. However there's still a lot of boilerplate code, which I don't like. Secondly the values can't be changed at runtime, which was a requirement of the original post. The code would then look something like this:
template <class T>
static void OpenLib(lua_State* L)
{
  if (MetaData<T>::myTable() && MetaData<T>::myTableName())
  {
    luaL_openlib(L, MetaData<T>::myTableName(), MetaData<T>::myTable(), 0);
  }
}

Singleton

Another trick we can use to avoid needing to define static variables in a .cpp file is to wrap each variable in a singleton-style accessor function. The function-local static variable is constructed the first time the function is called and lives until the end of the application.
template<typename T>
struct MetaData
{
  static std::string& data()
  {
    static std::string data_("");
    return data_;
  }
};
As we return a non-const reference we can still change the value at run-time, like this:
MetaData<int>::data() = "Omg Froobles";
However some people (myself included) don't really like the use of singletons (or global variables) as they can make code quite hard to test. The code would then look something like this:
template <class T>
static void OpenLib(lua_State* L)
{
  if (MetaData<T>::myTable() && MetaData<T>::myTableName())
  {
    luaL_openlib(L, MetaData<T>::myTableName(), MetaData<T>::myTable(), 0);
  }
}
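
Putting the pieces together, a small self-contained sketch (MyClass and perTypeStorageIsIndependent are placeholders for illustration) shows that each instantiation of MetaData gets its own string and that the values can be changed at run-time:

```cpp
#include <string>

template<typename T>
struct MetaData
{
  // One std::string per T, constructed the first time data() is called
  // for that particular T.
  static std::string& data()
  {
    static std::string data_;
    return data_;
  }
};

struct MyClass {};  // placeholder type for illustration

// Sets values for two different types and reports whether each type
// kept its own value.
bool perTypeStorageIsIndependent()
{
  MetaData<int>::data() = "Omg Froobles";
  MetaData<MyClass>::data() = "something else";
  return MetaData<int>::data() == "Omg Froobles"
      && MetaData<MyClass>::data() == "something else";
}
```

No definitions are needed in a .cpp file, and a type that is never given a value simply reports an empty string. That is convenient, but it also means a forgotten registration is no longer caught at compile or link time.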

(Almost) Template free type map.

There's an alternative to using templates as a way into the C++ type system, and that is type_info (Well there's also polymorphism, but we'll ignore that option for now). If we go down that route we can reduce the usage of templates and global/singleton variables significantly.
struct ClassMetaData
{
  std::string data;
};

struct TypeInfoCompare
{
  bool operator()(const std::type_info* i1, const std::type_info* i2) const
  {
    return i1->before(*i2);
  }
};

std::map<const std::type_info*, ClassMetaData, TypeInfoCompare> metaDataMap;
Using this design I'd probably produce code that looks like this:
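typedef std::map<const std::type_info*, ClassMetaData, TypeInfoCompare> MetaDataMap;

// Non-template core: easy to test with a hand-built ClassMetaData.
static void OpenLibCore( const ClassMetaData & m, lua_State* L )
{
  if ( m.myTable && m.myTableName )
  {
    luaL_openlib(L, m.myTableName, m.myTable, 0);
  }
}

// Thin template wrapper: looks up the entry for T, then delegates.
template<typename T>
static void OpenLib( const MetaDataMap & metadata, lua_State* L )
{
  // find() rather than operator[], because the map is passed as const
  MetaDataMap::const_iterator it = metadata.find( &typeid(T) );
  if ( it != metadata.end() )
  {
    OpenLibCore( it->second, L );
  }
}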
And while you now need to pass in an additional parameter to the OpenLib function, these functions are now easily testable. And if you really hate that additional parameter you can make the MetaDataMap a global variable or singleton.
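
To see the map on its own, without Lua in the way, here is a minimal sketch that reuses ClassMetaData and TypeInfoCompare from above. setMetaData and findMetaData are helper names made up for this example; find is used for lookups so that the map can be passed as const:

```cpp
#include <map>
#include <string>
#include <typeinfo>

struct ClassMetaData
{
  std::string data;
};

// Orders type_info pointers by the implementation-defined type ordering,
// not by pointer value.
struct TypeInfoCompare
{
  bool operator()(const std::type_info* i1, const std::type_info* i2) const
  {
    return i1->before(*i2);
  }
};

typedef std::map<const std::type_info*, ClassMetaData, TypeInfoCompare> MetaDataMap;

// Register (or overwrite) the metadata for T.
template<typename T>
void setMetaData(MetaDataMap& map, const std::string& data)
{
  map[&typeid(T)].data = data;
}

// Look up the metadata for T; returns nullptr when T was never registered.
template<typename T>
const ClassMetaData* findMetaData(const MetaDataMap& map)
{
  MetaDataMap::const_iterator it = map.find(&typeid(T));
  return it == map.end() ? nullptr : &it->second;
}
```

Because the map is an ordinary object, a test can build one with exactly the entries it needs and throw it away afterwards. If C++11 is available, std::type_index from <typeindex> can be used as the key instead, which removes the need for TypeInfoCompare.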