Wednesday, May 18, 2011

Git's prepare-commit-msg hook

So I often write bad commit messages. At best they're inconsistent: sometimes with ticket IDs at the start, sometimes with them at the end; sometimes only a short message, sometimes a nicely formatted bullet-point list.

Thankfully git has a tool to help keep these things consistent. It's called the prepare-commit-msg hook.

I wrote a little Python script to generate a default commit message. It's not perfect, but it should help... It takes the branch name, searches for a version tag and removes it, then searches for anything that might be a ticket ID and adds it to the message. Finally it adds some boilerplate.
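The script itself isn't attached here, but a minimal sketch of that kind of hook might look something like this (the version-tag and ticket-id patterns below are placeholders for whatever convention your branch names follow, not my actual script):

#!/usr/bin/env python
# .git/hooks/prepare-commit-msg  (must be executable)
# git passes the path of the draft commit message as the first argument.
import re
import subprocess
import sys

msg_file = sys.argv[1]

# Current branch name, e.g. "v1.2-ABC-123-fix-login"
branch = subprocess.check_output(
    ["git", "rev-parse", "--abbrev-ref", "HEAD"]).decode().strip()

# Strip a leading version tag like "v1.2-"
branch = re.sub(r"^v\d+(\.\d+)*-?", "", branch)

# Grab anything that looks like a ticket id, e.g. "ABC-123"
m = re.search(r"[A-Z]+-\d+", branch)
ticket = m.group(0) if m else ""

with open(msg_file) as f:
    original = f.read()

with open(msg_file, "w") as f:
    if ticket:
        f.write(ticket + " : \n")
    f.write("# * What changed, and why?\n")  # the boilerplate
    f.write(original)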

Command line git trees

Often I'm stuck in a terminal and want to see the commit history in a nice tree format. There's a nice solution: git log recently learnt the "--graph" option.

You can get pretty nice results from "--graph --oneline"

But using the tricks from http://www.jukie.net/bart/blog/pimping-out-git-log we can get a much nicer result, including author, branches, and time, all coloured nicely.
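The incantation from that post looks roughly like this (the format string is from memory, so tweak it to taste - and stick it in a git alias so you don't have to retype it):

git log --graph --abbrev-commit --date=relative \
    --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset'

Here %h is the abbreviated hash, %d any branch/tag names, %s the subject line, %cr the relative commit time, and %an the author.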

Friday, April 29, 2011

Rendering maths from a web page

Occasionally I do some pretty math-intensive coding,
and one of the best pieces of software for writing about maths is TeX.
It was written by one of the gurus/fathers of modern computer science in 1978, and it is still in use all over the world. So when it comes to putting math on the web, the workflow used to look like this:
  • render the math using TeX
  • copy and paste the output somewhere
  • add an image link to your docs.
However, some clever people have made that process much easier. Two efforts in particular look very nice:

Here's some MathJax: $ \nabla \times \vec{\rho} = 0 $
Here's the same thing using the Google Chart API:

I think the MathJax version is nicer, but the Google Charts version is easier to integrate.
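For anyone wanting to try this themselves, the two approaches look roughly like the snippets below (written from memory, so check the respective docs - and note the Google version wants the TeX URL-encoded):

<!-- MathJax: include the script once, then write TeX inline in the page -->
<script type="text/javascript"
  src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>

<!-- Google Chart API: the rendered formula is just an image -->
<img src="http://chart.apis.google.com/chart?cht=tx&chl=\nabla\times\vec{\rho}=0" />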

Friday, April 1, 2011

Using saru with googlemock and googletest

Saru (earlier blog post) is my little testing framework. It does everything we need. It's been used in several serious software development situations. But sometimes things are a little painful.

For example, saru doesn't come with any nice mocking helpers for C++. There is a basic C++ testing library that comes with it, but writing your own mocks by hand is one of those painful things I mentioned.

Thankfully there's a nice mocking library for C++ from the folks at Google called, oddly enough, googlemock. However, googlemock is designed to work with googletest, the Google testing framework.

Googletest is also great. But it's orthogonal to saru rather than competing with it: saru is cross-language and designed to be more of a test-running wrapper, while googletest is a C++ unit-testing library.

So I had three options if I wanted to use googlemock with my code and saru:
  1. Make googlemock work with the saru-cxx library.
  2. Make googletest output in a format that saru could digest.
  3. Make saru able to parse googletest output.
IMO the third is the wisest and most extensible option. Luckily the changes were pretty easy.

So now getting a googletest file working in saru is as trivial as adding a
// SARU : Format gtest
to the top of the test file... and everything just works :)
(Well, you'll need to make sure the compiler can find the right includes and the gtest library, but that's all.)

For example, I get this kind of output when running a test suite:
99-misc-00-periodic-processor.cpp::TestFixture::test_process_many : OK
gmock_test.cpp::PartyTest.CallsDance : OK
gmock_test.cpp::PartyTest.CallsDanceFails : FAILED
==MESSAGE==

==STDERR==
gmock_test.cpp:69: Failure
Value of: p.party()
  Actual: false
Expected: true
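For the curious, the PartyTest lines above come from a test file that looks roughly like this (the names are reconstructed from the output above, not the exact file):

// SARU : Format gtest
#include <gmock/gmock.h>
#include <gtest/gtest.h>

// The interface we want to mock out.
struct Dancer
{
  virtual ~Dancer() {}
  virtual bool dance() = 0;
};

// googlemock writes the mock implementation for us.
struct MockDancer : public Dancer
{
  MOCK_METHOD0(dance, bool());
};

// The class under test: a party is only a party if someone dances.
struct Party
{
  Dancer & dancer;
  explicit Party(Dancer & d) : dancer(d) {}
  bool party() { return dancer.dance(); }
};

TEST(PartyTest, CallsDance)
{
  MockDancer d;
  Party p(d);
  EXPECT_CALL(d, dance()).WillOnce(::testing::Return(true));
  EXPECT_TRUE(p.party());
}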

Wednesday, March 2, 2011

Introducing the fastjson library

So using my poor man's sampler (and the awesome shark tool on OS X) we found that my application spends a lot (90%) of its time serializing and deserializing json. That's "not good".

I've had a lot of pain finding a good json library for C or C++. Some are one-way (encode or decode only). Some won't handle large numbers (uint64_t at least). Some can't handle unicode.

The boost version, based on boost::property_tree, looked very promising at first. But it has a lot of nasty edge cases. (Unicode is broken unless you're using wchar_t, value nodes lose their type and become strings, and certain edge cases won't serialize properly.) We managed to work around all these issues, patching the boost libraries and putting hacks into our code. But the insurmountable problem is that it is SLOW.

Now I could look for another library and work around its idiosyncrasies, but boost is already the second library that we've tried to work around... and none of the other libraries look promising (or have licenses we can work with...).

What can you do when one of your core libraries is causing most of your pain (bugs, performance, hacks)? Rewrite it ;)

This is not a path we took lightly. But we have a working alpha/beta version of the library, and it's about to go into our live servers. The code is a bit disorganized and could do with a clean-up, but our goals have been reached. It seems to parse and write json about 20x faster than our corresponding boost-wrapped code. It performs far fewer allocations. It doesn't throw away type information. It supports arbitrary-sized numbers. And it does the right thing with unicode (except for one edge case: properly converting some UTF-8 into surrogate-escaped UTF-16 pairs).

Anyway I'll do a proper post on how to use it soon, but for now here's the github link:

https://github.com/mikeando/fastjson

I'd love people to play with it and leave feedback.

There's also a post about it on my company blog.

Monday, February 14, 2011

Poor man's sampler saves the day again.

So one of my applications was being very slow to start up when running in a VM.
Running the poor man's sampler detailed in my previous post, I found that I got a heap of samples looking like this:


Thread 1 (Thread 0xb6f436f0 (LWP 1657)):
#0  0xb7828430 in __kernel_vsyscall ()
#1  0xb7370f93 in read () from /lib/tls/i686/cmov/libc.so.6
#2  0xb731aedb in _IO_file_underflow () from /lib/tls/i686/cmov/libc.so.6
#3  0xb731dcc8 in __underflow () from /lib/tls/i686/cmov/libc.so.6
#4  0xb731a888 in ?? () from /lib/tls/i686/cmov/libc.so.6
#5  0xb731c7b8 in _IO_sgetn () from /lib/tls/i686/cmov/libc.so.6
#6  0xb73103be in fread () from /lib/tls/i686/cmov/libc.so.6
#7  0x0811028f in main (argc=2, argv=0xbffccf54) at src/server.cpp:252

That is, the application was waiting for a read to complete - sometimes for 5-10 seconds, sometimes for a few minutes.
What was this troublesome read? An unexpected socket call? Nope - something I would never have expected. Here's the "offending" code:

FILE * f = fopen("/dev/random", "r");
unsigned seed;
fread( &seed, sizeof(seed), 1, f);
fclose(f);
srandom(seed);
This code seeds the random number generator with a value pulled from the random device /dev/random.

Reading from /dev/random blocks until there is enough entropy in its internal entropy pool to complete the read. On an isolated VM there's not much system noise generating entropy, and so the pool was emptying quickly.

It turns out there's a non-blocking random device, /dev/urandom, that uses feedback of hashed values to avoid blocking when the entropy pool is low. Switching to it fixed the problem.
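The fix is a one-word change to the snippet above:

// As before, but /dev/urandom never blocks waiting for entropy.
FILE * f = fopen("/dev/urandom", "r");
unsigned seed;
fread( &seed, sizeof(seed), 1, f);
fclose(f);
srandom(seed);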

Now, maybe I could have found this by doing some traditional debugging (or maybe not, as that may have generated enough noise to stop /dev/random blocking!). But using dumps from gdb, the problem was found in a few minutes.

Sunday, February 6, 2011

Poor man's sampling profiler for live processes using GDB

While working on OS X I've got used to having the shark system profiler at my fingertips. I love being able to see what's going on in a live process, where all the threads are stuck, and what's taking up all the time on my system.

On Linux you can use the oprofile kernel module, or the commercial Zoom profiler (which uses a modified oprofile under the hood, I believe).

However, if these aren't available to you, then you can attach to your process using gdb and manually Ctrl-C, backtrace, and continue to get a feel for what's going on. This is suggested in several posts on stackoverflow (here and here).

A neater way to do this, which pauses the application only briefly, is:

gdb -batch -x stackdumper.gdb ./a.out 123456 > stack.0

where ./a.out is the binary you are interested in and 123456 is the PID.

If you set stackdumper.gdb to contain

thread apply all backtrace

Then you'll get a backtrace of all threads. The advantage of this over the manual method is that the binary is stopped for as little time as possible.
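To turn this into a crude sampling profiler, just run it in a loop and look for the stacks that keep showing up - something like:

for i in 0 1 2 3 4 5 6 7 8 9 ; do
    gdb -batch -x stackdumper.gdb ./a.out 123456 > stack.$i
    sleep 1
done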

I used this to find that all our threads were waiting on some JSON-writing code that should have been fast: a sample of about 10 runs of the sampler showed one thread deep in json decoding and 2-7 other threads all waiting in pthread mutex/condition code.