tag:blogger.com,1999:blog-33268231963050762922024-02-20T01:20:58.575+10:30The Robots BrainMusing and notes of a coderMichael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.comBlogger43125tag:blogger.com,1999:blog-3326823196305076292.post-77927597124572164272018-06-22T14:34:00.002+09:302018-06-22T14:35:27.383+09:30Finding changes between branches with git.I'm often looking for a way to find the particular changes between two branches in git. I know I can use <br />
<span style="background-color: #2a2a2a; font-family: monospace;">git log -S'string'</span> (the pickaxe search).<br />
But it can be tricky to get concise output from it. <br />
<br />
I've added an alias to my <br />
<span style="background-color: #2a2a2a; font-family: monospace;">.gitconfig</span><br />
file to get me the info I want at a glance. <br />
<br />
<div style="background-color: #2a2a2a; font-family: monospace;"><pre>[alias]
findchanges = "!f() { revision=$1; shift ; \
    for x in $(git diff $revision --name-status | cut -f 2) ; do \
        git diff $revision -U0 -- $x | grep '^[+-] ' | sed 's#^#'$x': #' ; \
    done | grep \"$@\" ; \
} ; f"
</pre></div><br />
(Note: the trailing backslashes let the alias span multiple lines in <span style="background-color: #2a2a2a; font-family: monospace;">.gitconfig</span>.)<br />
<br />
This lets me do the following<br />
<br />
<div style="background-color: #2a2a2a; font-family: monospace;"><pre>> git findchanges my_branch -i todo
fileA.c: + // TODO: We really should fix this
fileB.c: - // TODO: Make sure we tweak the frobnitz later
</pre></div><br />
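When you only need the matching commits (rather than the individual lines), git's built-in pickaxe can answer a similar question on its own. Here's a self-contained demo in a throwaway repo (the file name, branch names, and commit messages are made up for illustration):

```shell
# Demo of the pickaxe on a disposable repo: list commits on my_branch
# (but not master) whose diffs add or remove the string "TODO".
cd "$(mktemp -d)" && git init -q .
git config user.email demo@example.com && git config user.name Demo
echo 'int x;' > fileA.c
git add fileA.c && git commit -q -m 'add fileA'
git branch -M master                         # name the base branch explicitly
git checkout -q -b my_branch
echo '// TODO: fix this' >> fileA.c
git commit -q -am 'note a todo'
git log --oneline -S TODO master..my_branch  # lists only 'note a todo'
```

The alias above goes further by printing the matching added/removed lines per file rather than per commit.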
Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-7273645146406424232014-05-06T13:43:00.000+09:302014-05-06T13:43:27.174+09:30Profiling IO in java on linux<p>This is more of a note to myself, but others may find it useful.</p><br />
<p>When you want to see what bottlenecks are being hit by some java process that you're running, you can do the following:</p><br />
<ol><li>Repeatedly grab backtraces using <tt>jstack PID</tt>. If you keep seeing threads blocked in a read/write-related call then you're IO bound</li>
<li>Check which files the process has open using <tt>lsof -p PID</tt>. It's likely that one of these will look suspect</li>
<li>Watch reads and writes to those open files using <tt>strace -f -p PID -e trace=read,write -e write=FD1,FD2,... -e read=FD1,FD2,...</tt></li><br />
</ol><br />
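The three steps can be bundled into a small helper. This is a hypothetical sketch (the function name and FD numbers are made up): it echoes each command rather than running it, with the real invocations left commented out, so it's safe to try anywhere:

```shell
# Hypothetical helper bundling the three steps above; pass the java PID
# and the comma-separated FD list you want strace to dump.
java_io_probe() {
    pid=$1; fds=$2
    for cmd in "jstack $pid" \
               "lsof -p $pid" \
               "strace -f -p $pid -e trace=read,write -e read=$fds -e write=$fds"
    do
        echo "+ $cmd"
        # $cmd        # uncomment to actually run each step
    done
}
java_io_probe 12345 3,4
```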
<br />
<br />
Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-22769664341995709232014-03-07T13:13:00.003+10:302014-07-10T17:12:08.955+09:30Cleaning up whitespace additions in git<p>I wrote a previous post on cleaning up end-of-line whitespace using git... But I've now run into a similar issue: the IDE used by several of the developers on my project<br />
likes to change the surrounding whitespace indentation of lines. This means that if you change just one line, the indentation changes will swamp the real changes, making the diff unpleasant to look at.</p><br />
<p>You can view a diff without showing whitespace changes in git using <code>--ignore-space-change</code> or <code>--ignore-all-space</code>. However, unfortunately the command<br />
</p><pre>git rebase -f master --ignore-space-change
</pre><p>doesn't do what one might hope.<br />
</p><br />
<p>However all is not lost. You can get the same kind of behaviour by plumbing together <code>git format-patch</code> and <code>git am</code>.</p><br />
<pre>git branch fixed 40caad7
git checkout fixed
git format-patch --stdout --ignore-all-space fixed..original | git am --ignore-whitespace
</pre><br />
<p>This takes the changes from <code>40caad7</code> to <code>original</code> and applies them to the new <code>fixed</code> branch, but removes/fixes whitespace changes.<br />
</p><br />
<p>However, this is not nice to use, so you can wrap it up into a little git alias (the trailing backslashes let the value span multiple lines in your .gitconfig):<br />
</p><br />
<pre>[alias]
cleanwhite = "!f() { \
    orig=$(git rev-parse HEAD) ; \
    mergebase=$(git merge-base HEAD $1) ; \
    git reset --hard $1 ; \
    git format-patch --stdout --ignore-all-space $mergebase..$orig | git am --ignore-whitespace ; \
} ; f"
</pre><br />
Now cleaning up your current branch is as simple as<br />
<br />
<pre>$ git cleanwhite master
</pre><br />
Which will clean up the whitespace in all commits from your current commit back to where you diverged from <code>master</code>.<br />
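To see the end-to-end effect, here's a self-contained demo on a throwaway repo (file and branch names are made up): a commit that both re-indents a line and adds a new one gets rewritten so that only the real addition survives.

```shell
# Demo: whitespace-only changes are dropped, real changes are kept.
cd "$(mktemp -d)" && git init -q .
git config user.email demo@example.com && git config user.name Demo
printf 'a\nb\n' > f.txt
git add f.txt && git commit -q -m base
git branch -M base                         # name the base branch explicitly
git checkout -q -b original
printf '    a\nb\nc\n' > f.txt             # re-indent 'a' AND add real line 'c'
git commit -q -am 'indent + add c'
git checkout -q -b fixed base
git format-patch --stdout --ignore-all-space base..original | git am -q --ignore-whitespace
git diff base -- f.txt                     # only the '+c' change remains
```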
Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-6485747362557109112012-10-20T13:22:00.003+10:302012-10-20T13:22:46.820+10:30Finding changes in ugly XML with git and xmllint<p>This is probably not going to be useful to very many people, but it helped me track down a small bug, and I'm preserving it mostly in case I need something similar later.</p><br />
<p>So the key issue here is that we have some XML that is stored in git. Unfortunately this generated XML is not nicely formatted. Thus changes in git don't show nicely using <code>git log</code> or <code>git diff</code>.</p><br />
<p>My technique was this:<br />
<ol> <li>Find the commits that changed the file of interest.<br/><br />
<code>git log --oneline afile.xml | awk '{print $1}'</code><br />
</li>
<li>Get the file at that revision.<br/><br />
<code>git show $REVISION:afile.xml</code><br />
</li>
<li>Get the file at the previous revision<br/><br />
<code>git show $REVISION~1:afile.xml</code><br /></li>
<li>Pass these through <code>xmllint --format</code> to clean them up</li><br />
<li><code>diff</code> the cleaned up versions</li><br />
</ol>This sounds pretty complex, but it can be wrapped up into a concise piece of bash script.
<script class="brush: bash" type="syntaxhighlighter">
<![CDATA[
git_pretty_xml() { git show $1 | xmllint --format - ; }
for REV in $(git log --oneline afile.xml | awk '{print $1}')
do
echo $REV
diff -u <( git_pretty_xml $REV~1:afile.xml ) <( git_pretty_xml $REV:afile.xml )
done
]]>
</script>
<br />
Now it's not going to work across merges, etc., but the general technique can be handy.<br />
<br />
I suspect a similar thing may have been obtainable by setting a custom diff tool in git but I couldn't see an easy way to get exactly what I wanted.Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-4013187533183999172012-10-03T18:34:00.000+09:302012-10-03T18:34:02.689+09:30Partial template specialization for functions in C++<h2>Partial template specialization for functions in C++</h2><div>The short of it is you can't do it. But you can do something that looks just like it.<br />
</div><h3>What is template specialization?</h3><div>For functions template specialization looks like this:<br />
<br />
<script class="brush: cpp" type="syntaxhighlighter">
<![CDATA[
#include <iostream>

template<int i, int j>
void say_something() {
std::cout<<"Hello "<<i<<","<<j<<std::endl;
}
template<>
void say_something<7,7>() {
std::cout<<"Double 7 aren't you lucky"<<std::endl;
}
int main() {
say_something<1,1>(); // prints "Hello 1,1"
say_something<2,3>(); // prints "Hello 2,3"
say_something<7,7>(); // prints "Double 7 aren't you lucky"
}
]]>
</script><br />
So the first template tells us what to do in general and the specialization tells us what to do in particular cases.<br />
</div><br />
<h3>What is partial template specialization?</h3><div>It's just like template specialization, but you're not specifying all the template parameters. So for the example above we might like to add a specialization that prints "Lucky 7" whenever the first template argument is 7. If we could write it, it would look like this (but it's not valid C++)<br />
<br />
<script class="brush: cpp" type="syntaxhighlighter">
<![CDATA[
template<int j>
void say_something<7,j>() {
std::cout<<"Lucky 7"<<std::endl;
}
]]>
</script><br />
<br />
However this doesn't work - it's not valid C++.<br />
</div><br />
<h3>What is the issue?</h3><div>C++ does not allow function partial specialization. <br />
</div><h3>How do we get around it?</h3><div>C++ <b>does</b> allow partial template specialization for <b>classes</b> (and structs). So our solution is just to defer to a templated helper class with static functions:<br />
<br />
<script class="brush: cpp" type="syntaxhighlighter">
<![CDATA[
#include <iostream>

//The base case.
template<int i, int j>
struct say_something_impl {
static void say_something() {
std::cout<<"Hello "<<i<<","<<j<<std::endl;
}
};
//Our first complete specialization
template<>
struct say_something_impl<7,7> {
static void say_something() {
std::cout<<"Double 7 aren't you lucky"<<std::endl;
}
};
//Our partial specialization
template<int j>
struct say_something_impl<7,j> {
static void say_something() {
std::cout<<"Lucky 7"<<std::endl;
}
};
// Our function that we wish we could specialise
// It now defers to the implementation templates
template<int i, int j>
void say_something() {
say_something_impl<i,j>::say_something();
}
int main() {
say_something<1,1>(); // prints "Hello 1,1"
say_something<7,1>(); // prints "Lucky 7"
say_something<7,7>(); // prints "Double 7 aren't you lucky"
}
]]>
</script><br />
<br />
It's verbose, but it works.<br />
<br />
<b>NOTE</b>: in this case we're working with `int` parameters, which would be handled better by a normal function containing a couple of `if` statements; however, all this works for type templates too.<br />
<br />
<br />
</div>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com1Perth WA, Australia-31.932854 115.86194-32.7953185 114.5985125 -31.070389499999997 117.12536750000001tag:blogger.com,1999:blog-3326823196305076292.post-29517850714267028352012-04-19T13:12:00.000+09:302012-04-19T13:12:43.693+09:30Faking watch on OS X<p>Sometimes watching something change over time in your terminal is kinda helpful. Linux has a great utility for this called "watch". OS X doesn't come with this. Of course you can build it yourself if you want... but that can mean a chunk of pain too... luckily bash comes with what you need. "while" and some ANSI escape sequences are enough to get you going.</p>
<p>Here was my first try:</p>
<script class="brush: bash" type="syntaxhighlighter">
<![CDATA[
while true
do
clear
git log --graph --all
sleep 4
done
]]>
</script>
<p>The first problem with this for me was that in iterm2 clear just keeps adding to your terminal scrollback in some weird way. Luckily replacing it with some ANSI control sequences fixes that up nicely.</p>
<script class="brush: bash" type="syntaxhighlighter">
<![CDATA[
while true
do
echo -n -e "\x1b[2J\x1b[H"
git log --graph --all
sleep 4
done
]]>
</script>
<p>Now it doesn't add to the scrollback, but it does blink every 4 seconds. I fixed this by storing the current and previous results in temp files and only clearing if they've changed.</p>
<script class="brush: bash" type="syntaxhighlighter">
<![CDATA[
tempfoo=`basename $0`
TMPFILE0=`mktemp -q -t ${tempfoo}`
if [ $? -ne 0 ]
then
echo "$0: Can't create temp file, exiting..."
exit 1
fi
TMPFILE1=`mktemp -q -t ${tempfoo}`
if [ $? -ne 0 ]
then
echo "$0: Can't create temp file, exiting..."
exit 1
fi
while true
do
git log --graph --all -n 50 > $TMPFILE0
diff -q $TMPFILE0 $TMPFILE1 > /dev/null 2>&1
if [ $? -ne 0 ]
then
echo -n -e "\x1b[2J\x1b[H"
git log --graph --all -n 50
mv $TMPFILE0 $TMPFILE1
fi
sleep 4
done
]]>
</script>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-19530138480422454482012-03-24T11:34:00.000+10:302012-03-24T11:34:02.724+10:30Farewell Tim<br />
It has been just over two weeks since my little brother Tim was killed. He was hit by a drunk driver while cycling - training for the Ride For Youth, a charity ride raising money for young people at risk of self harm or suicide.<br />
<br />
The Ride For Youth requires serious dedication from those participating, riding 640km over 5 days. They are expected to train with the team at least 3 days a week for 6 months. In the week before his death Tim rode over 400km. My Dad has been doing the ride for several years, and this year was going to be his last and Tim's first. Tim was athletic all his life, but with this training he was fitter than he'd ever been.<br />
<br />
Tim had an amazing group of friends. He touched so many lives. We've had messages of sympathy and support from all over the world. We estimate somewhere between 700 and 1000 people attended his funeral. Through his studies and work as Chemical Engineer he lived all over the world: Perth, Brisbane, Dubai, Wales, Switzerland and more. Everywhere he went he made great lifelong friends. To Tim everyone was a potential friend.<br />
<br />
Tim was also intelligent. He had left his work to pursue a Masters degree in Chemical Engineering, and was considering upgrading that to a Ph.D. Amazingly Tim was using many of the tools I'd studied in my Ph.D. and post-doc. In the last year we'd had conversations about such esoteric topics as "Convergence and Stability of Finite Element Methods."<br />
<br />
Finally, Tim was an amazing uncle to my daughter Kira, and my son Grant. Even after a training ride of 140km Tim would still have the energy to run around the table with Kira playing chasey. He'd leave the social events early to take Kira to the beach. Kira is going to miss him terribly, and I'm so sad that Grant will never get to know him. I was so looking forward to Tim teaching them both things like surfing and basketball. (Both of which I'm useless at.)<br />
<br />
<br />
<br class="Apple-interchange-newline" />If you want to support Tim's cause, don't send us flowers, donate here:<br />
<a href="http://www.rideforyouth.com.au/riders/enjo/tim-anderson">http://www.rideforyouth.com.au/riders/enjo/tim-anderson</a><br /><br />Some further information about Tim and the accident can be found here:<br />
<a href="http://www.facebook.com/remembertimbo">www.facebook.com/remembertimbo</a><br />
<br />
<br />
I don't know what else to say except<br />
<br />
Goodbye Tim,<br />
We're really going to miss you.<br />
<br />Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-69918434313669154452012-03-02T18:20:00.000+10:302012-03-02T18:20:59.920+10:30Installing node.js on OS X 10.5<p>The binary distributions of node.js no longer work on 10.5 (at least none that I could find). So I went about building my own from source. There are several pitfalls I had to overcome, so I figured I'd list the solutions here.</p>
<p>Step 1. Download the source. I did this using:
<pre>
git clone git://github.com/joyent/node.git
git checkout origin/v0.6.11-release
</pre>
</p>
<p>Step 2. Patch the included v8 source. If you don't patch it you get an error about missing symbols for <code>Dictionary<SeededNumberDictionaryShape, uint32_t>::SlowReverseLookup</code>, although the names may be mangled, so it's not too obvious.
<pre>
diff --git a/deps/v8/src/objects.cc b/deps/v8/src/objects.cc
index 88ebbf4..c4aea1c 100644
--- a/deps/v8/src/objects.cc
+++ b/deps/v8/src/objects.cc
@@ -10012,6 +10012,9 @@ template Object* Dictionary<UnseededNumberDictionaryShape, uint32_t>::
template Object* Dictionary<StringDictionaryShape, String*>::SlowReverseLookup(
Object*);
+template Object* Dictionary<SeededNumberDictionaryShape, uint32_t>::SlowReverseLookup(
+ Object*);
+
template void Dictionary<SeededNumberDictionaryShape, uint32_t>::CopyKeysTo(
FixedArray*,
PropertyAttributes,
</pre>
</p>
<p>Step 3. Download a new version of openssl and build and install shared versions. I downloaded version 0.9.8t and built it using:</p>
<pre>
./config shared
make
make install
</pre>
On OS X this installs to <code>/usr/local/ssl</code> by default.
If you try to use the system's default version of openssl then you get a bunch of errors:
<pre>
../src/node_crypto.cc: In member function ‘bool node::crypto::DiffieHellman::Init(int)’:
../src/node_crypto.cc:3537: error: ‘DH_generate_parameters_ex’ was not declared in this scope
../src/node_crypto.cc: In static member function ‘static v8::Handle<v8::Value> node::crypto::DiffieHellman::ComputeSecret(const v8::Arguments&)’:
../src/node_crypto.cc:3811: error: ‘DH_check_pub_key’ was not declared in this scope
../src/node_crypto.cc:3814: error: ‘DH_CHECK_PUBKEY_TOO_SMALL’ was not declared in this scope
../src/node_crypto.cc:3817: error: ‘DH_CHECK_PUBKEY_TOO_LARGE’ was not declared in this scope
</pre>
If you forget to build a shared version then node will almost compile, but complain about missing symbols at link time for the final binary.
</p>
<p>Step 4. Build node using the changes you've made:
<pre>
./configure --openssl-includes=/usr/local/ssl/include/ --openssl-libpath=/usr/local/ssl/lib/
make
make install
</pre>
</p>
<p>Now hopefully you'll have a running version of node installed.
</p>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-58722146903330001592012-02-09T14:50:00.001+10:302012-02-09T14:52:19.900+10:30Javascript objects are not hashes<p>I've some across several posts from people warning about using objects as maps( [<a href="http://www.2ality.com/2012/01/objects-as-maps.html">1</a>], [<a href="http://www.devthought.com/2012/01/18/an-object-is-not-a-hash/">2</a>] ), and while they're right I think they miss an important feature. <b>Objects are not hashes.</b> Instead they try to show you how to use them like objects.</p>
<p>The key to the whole issue is the difference in behavior of <code>in</code> and <code>hasOwnProperty</code>.</p>
<p>For example if I want to set a property in an object using a user supplied string I would do this:</p>
<script type="syntaxhighlighter" class="brush: javascript"><![CDATA[
posts = {}
if( slug in posts )
{
//It's got the property, but it's inherited via the prototype chain
if( ! posts.hasOwnProperty(slug) )
{
do_error("Invalid post name");
}
else
{
//It's already got content so append to it.
posts[slug] += content;
}
}
else
{
posts[slug] = content;
}
]]></script>
<p>This makes it clear that assigning & updating a field is a ternary issue (add/append/fail) rather than a binary one (add/append).</p>
If you're feeling paranoid about already broken data, then you might change the use of <code>posts.hasOwnProperty</code> to <code>Object.prototype.hasOwnProperty.call(posts,slug)</code>. But this code should prevent <code>hasOwnProperty</code> from getting overwritten in the first place.Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-77615316617502017242011-07-05T09:38:00.001+09:302012-04-19T13:13:40.473+09:30Examining multiple cores with gdb and grep<p>So you've got a bunch of core files from a failing application. You look in one of them and find what's causing that particular error. However you're not sure whether all the cores are caused by the one same issue.</p>
<p>The application I was having trouble with was a multithreaded app, so I couldn't rely on the stacks being identical every time... so this is what I did.</p>
<b>stackdumper.gdb</b>
<script type="syntaxhighlighter" class="brush: text"><![CDATA[
thread apply all backtrace
]]></script>
Then in bash:
<script type="syntaxhighlighter" class="brush: text"><![CDATA[
for x in core.1309*
do echo $x
gdb -batch -x stackdumper.gdb ./server.x $x | grep SomeFunctionThatIsCausingCrashes
done
]]></script>
<p>In my case I found that string in all 80 cores, so I know that was the only issue.</p>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-87127182052649507372011-06-02T21:43:00.002+09:302012-02-09T14:52:32.542+10:30Javascript databindings with observersThis is not quite as clear as the previous post, but here's a version of my javascript prototyping with observers on the data.
Again you can play with it using <a href="http://jsfiddle.net/gh/gist/jquery/1.6/1004318/">jsfiddle</a>... or here's the code.
<script src="https://gist.github.com/1004318.js?file=fiddle.html"></script>
<script src="https://gist.github.com/1004318.js?file=fiddle.js"></script>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-71801609871559181502011-06-02T13:33:00.001+09:302012-02-09T14:53:41.642+10:30Creating data bindings using javascript closuresI've been working on a javascript codebase that stores all its data in forms, hiding and showing all the form elements as the view changes. This has worked well while the structure of the data was relatively rigid. But now that we've got dynamically structured (tree-like) data, we need to be able to grow and shrink the views in more flexible ways.<br />
<br />
While I'm not a gung-ho MVC advocate, it certainly looked like the app could do with a bit of an MVC style cleanup (the data was in the view rather than being accessed by the view).<br />
<br />
The problem was it wasn't clear how to get the data out of the view... here's a simplified version of the method I'm planning to use.<br />
<br />
<script src="https://gist.github.com/1003899.js?file=fiddle.html">
</script>
<script src="https://gist.github.com/1003899.js?file=fiddle.js">
</script>
<br />
You can also play with this using <a href="http://jsfiddle.net/gh/gist/jquery/1.6/1003899/">jsfiddle</a><br />
<br />
I'm thinking about implementing broadcasting of changes and a more model-based approach later... but maybe that'll have to be another post.Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-72296243016655972702011-05-18T13:47:00.000+09:302011-05-18T13:47:11.277+09:30Gits prepare-commit-message hookSo I often write bad commit messages. At best they're inconsistent - sometimes with ticket ids at the start, sometimes at the end. Sometimes only a short message, sometimes a nicely formatted bullet point list.<br />
<br />
Thankfully git has a tool to help you get these things consistent. It's called the "prepare-commit-msg" hook.<br />
<br />
I wrote a little python script to make a default commit message. It's not perfect, but it should help... It takes the branch name, searches for a version tag and removes it, then searches for anything that might be a ticket id and adds it to the message. Finally it adds some boilerplate.<br />
<br />
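The idea can be sketched in shell. This is a hypothetical minimal equivalent (the real script is the Python gist below, and the branch-naming convention assumed here, e.g. "v2.1_feature_188", is made up for illustration):

```shell
#!/bin/sh
# Hypothetical prepare-commit-msg sketch: strip a leading version tag from
# the branch name, treat the first run of digits as a ticket id, and prefix
# the prepared commit message with it (e.g. "[#188] ").
ticket_from_branch() {
    echo "$1" | sed 's/^v[0-9.]*_//' | grep -o '[0-9][0-9]*' | head -n 1
}

MSGFILE=$1
branch=$(git symbolic-ref --short HEAD 2>/dev/null)
ticket=$(ticket_from_branch "$branch")
if [ -n "$ticket" ] && [ -f "$MSGFILE" ]
then
    sed -i.bak "1s/^/[#$ticket] /" "$MSGFILE"
fi
```

Dropped into `.git/hooks/prepare-commit-msg` and made executable, git runs it with the message file as its first argument.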
<script src="https://gist.github.com/977979.js?file=prepare-commit-msg.py"></script>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-84178952534938390352011-05-18T13:37:00.001+09:302011-05-18T13:39:38.152+09:30Command line git trees<p>Often I'm stuck in a terminal and want to see the commit history in a nice tree format.
There's a nice solution ... git log recently learnt the "--graph" feature.</p>
<p>You can get pretty nice results from "--graph --oneline"</p>
<script type="syntaxhighlighter" class="brush: text"><![CDATA[
bash> git log --graph --oneline
* 3b360c7 Made the Foozle boogie correctly
* f3b5485 Added a Foozle
* 7756624 Fix for unfrobnicated widget
|\
| * 16fbd87 Frobnicated the widget
| * 79e0b78 Added the foozle
|/
* 94d6b67 Start of version v2.1
]]></script>
<p>But using the tricks from <a href="http://www.jukie.net/bart/blog/pimping-out-git-log">http://www.jukie.net/bart/blog/pimping-out-git-log</a> we can get a much nicer result - including author and branches and time all coloured nicely</p>
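An alias along those lines can be set up like this (a sketch: the format string is an approximation of the one in the linked post, not necessarily identical):

```shell
# Define a 'git lg' alias: one-line graph with hashes, refs, subject,
# relative date and author, all coloured.
git config --global alias.lg "log --graph --abbrev-commit --date=relative \
--pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset'"
```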
<script type="syntaxhighlighter" class="brush: text"><![CDATA[
bash> git lg
* 3b360c7 - (HEAD, feature_188) [#188] Made the Foozle boogie correctly (2 hours ago) <Michael Anderson>
* f3b5485 - [#188] Added a Foozle (2 hours ago) <Michael Anderson>
* 7756624 - (v2.1_dev) [#271,#277 complete] Fix for unfrobnicated widget (3 hours ago) <Michael Anderson>
|\
| * 16fbd87 - (feature_271_277) [#271,#277] Frobnicated the widget (3 hours ago) <Michael Anderson>
| * 79e0b78 - [#271] Added the foozle (3 hours ago) <Michael Anderson>
|/
* 94d6b67 - (v2.1_base) Start of version v2.1 (3 hours ago) <Michael Anderson>
]]></script>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-4686542657352599262011-04-29T18:58:00.001+09:302011-04-29T18:59:16.687+09:30Rendering maths from a web pageOccasionally I do pretty math intensive coding.<br />
And one of the best pieces of software for writing about maths is <a href="http://en.wikipedia.org/wiki/TeX">TeX</a>. <br />
It was written by Donald Knuth, one of the fathers of modern computer science, in 1978, and it is still in use all over the world.
So when it comes to putting math on the web, the workflow used to look like this:
<ul>
<li>render math using tex</li>
<li>copy and paste somewhere</li>
<li>add an image link to your docs.</li>
</ul>
However, some clever people have made that process much easier; two efforts in particular look very nice:
<ul>
<li><a href="http://code.google.com/apis/chart/docs/gallery/formulas.html">TeX Formulas - from the Google Chart API</a></li>
<li><a href="http://www.mathjax.org/">MathJax</a></li>
</ul>
<br /><br />
Here's some MathJax: $ \nabla \times \vec{\rho} = 0 $<br />
Here's the same thing using the Google Chart API:
<img src="http://chart.apis.google.com/chart?cht=tx&chl=%5Cnabla%20%5Ctimes%20%5Cvec%7B%5Crho%7D%20%3D%200&chs=&chf=&chco="/><br/><br/>
I think the mathjax is nicer, but the google charts version is easier to integrate.Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-25059481866485780092011-04-01T19:40:00.001+10:302011-04-14T21:01:55.682+09:30Using saru with googlemock and googletest<a href="https://github.com/roarengine/saru">Saru</a> (<a href="http://therobotsbrain.blogspot.com/2009/10/testing-with-saru.html">earlier blog post</a>) is my little testing framework. It does everything we need. It's been used in several serious software development situations. But sometimes things are a little painful.<br />
<br />
For example, saru doesn't come with any nice mocking helpers for C++. There is a basic C++ testing library that comes with it, but writing your own mocks by hand is one of those painful things I mentioned.<br />
<br />
Thankfully there's a nice mocking library for C++ from the folks at google, called oddly enough, <a href="http://code.google.com/p/googlemock/">googlemock</a>. However googlemock is designed to work with <a href="http://code.google.com/p/googletest/">googletest</a> - the google testing framework.<br />
<br />
Googletest is also great. But it's orthogonal to saru rather than competitive with it. Saru is cross-language and designed to be more of a test-running wrapper, while googletest is a C++ unit-testing library.<br />
<br />
So I had three options if I wanted to use google-mock with my code and saru.<br />
<ol>
<li> Make the google-mocks work with the saru-cxx library.</li>
<li> Make google-test output in a format that saru could digest.</li>
<li> Make saru able to parse google-test output.</li>
</ol>
IMO the third is the wisest and most extensible option. Luckily the <a href="https://github.com/roarengine/saru/commit/676f6eaf3e079dab71d82578d6239c75d32e3f2c">changes</a> were pretty easy.<br />
<br />
So now getting a google-test file working in saru is as trivial as adding a
<br />
<pre class="prettyprint lang-cpp">
//SARU : Format gtest
</pre>
to the top of the test file... and everything just works :)<br />
(Well, you'll need to make sure the compiler can find the right includes and the gtest library... but that's all.)<br />
<br />
For example I get this kind of output when running a test suite.<br />
<pre>99-misc-00-periodic-processor.cpp::TestFixture::test_process_many : OK
gmock_test.cpp::PartyTest.CallsDance : OK
gmock_test.cpp::PartyTest.CallsDanceFails : FAILED
==MESSAGE==
==STDERR==
gmock_test.cpp:69: Failure
Value of: p.party()
Actual: false
Expected: true
</pre>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-84208699043404033912011-03-02T10:00:00.002+10:302011-03-03T09:20:21.460+10:30Introducing the fastjson librarySo using my poor man's sampler (and the awesome shark tool on OS X) we found that my application spends a lot (90%) of its time serializing and deserializing json. That's "not good".<br />
<br />
I've had a lot of pain finding a good json library for C or C++. Some are one-way (encode or decode only). Some won't handle large numbers (uint64_t at least). Some can't handle unicode.<br />
<br />
The boost version based on boost::property_tree looked very promising at first. But it has a lot of nasty edge cases. (Unicode is broken unless you're using wchar_t, value nodes lose their type and become strings, certain edge cases won't serialize properly). We managed to work around all these issues, patching the boost libraries and putting hacks into our code. But the insurmountable problem is that it is SLOW.<br />
<br />
Now I could look for another library and work around its idiosyncrasies, but boost is already the second library that we've tried to work around.. and none of the other libraries look promising (or have licenses we can work with...)<br />
<br />
What can you do when one of your core libraries is causing most of your pain (bugs, performance, hacks)? Rewrite it ;)<br />
<br />
This is not a path we took lightly. But we have an alpha/beta working version of the library, and it's about to go into our live servers. The code is a bit disorganized and could do with a bit of a clean-up... but our goals have been reached. It seems to parse and write json about 20x faster than our corresponding boost-wrapped code. It performs far fewer allocations. It doesn't throw away type information. It does support arbitrary-sized numbers. It does the right thing with unicode (except one edge case of properly converting some UTF8 into surrogate-escaped UTF16 pairs).<br />
<br />
Anyway I'll do a proper post on how to use it soon, but for now here's the github link:<br />
<br />
<a href="https://github.com/mikeando/fastjson">https://github.com/mikeando/fastjson</a><br />
<br />
I'd love people to play with it and leave feedback.<br />
<br />
There's also a <a href="http://blog.roarengine.com/2011/02/fastjson-or-why-we-stopped-using-3rd.html">post</a> about it on my company blog.Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-78014329264347659142011-02-14T14:08:00.001+10:302011-04-14T21:06:38.893+09:30Poor-man sampler saves the day again.So one of my applications was being very slow to start up when running in a VM.<br />
Running the poor man's sampler detailed in my <a href="http://therobotsbrain.blogspot.com/2011/02/poor-mans-sampling-profiler-for-live.html">previous post</a>, I found that I got a heap of samples looking like this:<br />
<br />
<br />
<pre>Thread 1 (Thread 0xb6f436f0 (LWP 1657)):
#0  0xb7828430 in __kernel_vsyscall ()
#1  0xb7370f93 in read () from /lib/tls/i686/cmov/libc.so.6
#2  0xb731aedb in _IO_file_underflow () from /lib/tls/i686/cmov/libc.so.6
#3  0xb731dcc8 in __underflow () from /lib/tls/i686/cmov/libc.so.6
#4  0xb731a888 in ?? () from /lib/tls/i686/cmov/libc.so.6
#5  0xb731c7b8 in _IO_sgetn () from /lib/tls/i686/cmov/libc.so.6
#6  0xb73103be in fread () from /lib/tls/i686/cmov/libc.so.6
#7  0x0811028f in main (argc=2, argv=0xbffccf54) at src/server.cpp:252
</pre><br />
<br />
That is, I was waiting for a read to complete. Sometimes for 5-10 seconds, sometimes for a few minutes.<br />
What was this troublesome read? An unexpected socket call? Nope - something I would never have expected. Here's the "offending" code.<br />
<br />
<pre class="prettyprint lang-cpp">
#include &lt;cstdio&gt;   /* fopen, fread, fclose */
#include &lt;cstdlib&gt;  /* srandom */

FILE * f = fopen("/dev/random", "r");
unsigned seed;
fread( &seed, sizeof(seed), 1, f);
fclose(f);
srandom(seed);
</pre>
This code seeds the random number generators using some values pulled from the random device /dev/random.<br />
<br />
Reading from /dev/random blocks until there is enough entropy in the kernel's entropy pool to complete the read. On an isolated VM there's not much system noise generating entropy, so the pool drained quickly and refilled slowly.<br />
<br />
It turns out there's a non-blocking random device that uses feedback of hashed values to avoid blocking when the entropy pool is low, so I switched the code to read from "/dev/urandom" and all was OK.<br />
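The difference is easy to see from a shell on a Linux box (a quick sketch, not part of the original fix): reads from /dev/urandom always return immediately, so pulling a seed this way never stalls.

```shell
# Read 4 bytes from the non-blocking device and print them as an
# unsigned 32-bit integer, a quick way to grab a seed in a script.
# Doing the same read against /dev/random can block on a quiet VM.
seed=$(od -An -tu4 -N4 /dev/urandom | tr -d ' ')
echo "seed=$seed"
```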
<br />
Now maybe I could have found this by doing some traditional debugging (or maybe not, as that may have generated enough noise to stop /dev/random blocking!). But using stack dumps from gdb, the problem was found in a few minutes.<br />
<div>
<br /></div>
<br />
<br />
<br />Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-3369265841253115692011-02-06T15:20:00.000+10:302011-02-06T15:20:07.926+10:30Poor mans sampling profiler for live processes using GDBWhile working on OSX I've got used to having the Shark system profiler at my fingertips. I love being able to see what's going on in a live process, where all the threads are stuck, and what's taking up all the time on my system.<br />
<br />
On Linux you can use the oprofile kernel module, or the commercial Zoom profiler (which I believe uses a modified oprofile under the hood).<br />
<br />
However, if these aren't available to you, you can attach to your process with gdb and manually Ctrl-C, backtrace, and continue to get a feel for what's going on. This is suggested in several posts on Stack Overflow (<a href="http://stackoverflow.com/questions/266373/one-could-use-a-profiler-but-why-not-just-halt-the-program">here</a> and <a href="http://stackoverflow.com/questions/375913/what-can-i-use-to-profile-c-code-in-linux/378024#378024">here</a>).<br />
<br />
A neater way to do this, pausing the application for as little time as possible, is:<br />
<br />
<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">gdb -batch -x stackdumper.gdb ./a.out 123456 > stack.0</span><br />
<br />
where ./a.out is the binary you are interested in and 123456 is the PID.<br />
<br />
If you set stackdumper.gdb to contain<br />
<br />
<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">thread apply all backtrace</span><br />
<br />
Then you'll get a backtrace on all threads. The advantage of this over the manual method is that the binary is stopped for as little time as possible.<br />
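To turn one-off dumps into a crude sampling profile, the command can be wrapped in a loop and the resulting files tallied. A sketch, assuming the same stackdumper.gdb, ./a.out and PID as above (all placeholders):

```shell
# Take ten snapshots of every thread's stack, one second apart.
# Requires gdb on the PATH; ./a.out and the PID are placeholders.
sample() {
    pid=$1
    for i in 0 1 2 3 4 5 6 7 8 9; do
        gdb -batch -x stackdumper.gdb ./a.out "$pid" > "stack.$i"
        sleep 1
    done
}

# Tally which frames recur across the samples; frames that dominate
# the count are where the process is spending (or losing) its time.
tally() {
    grep -h '^#[0-9]' stack.* | sort | uniq -c | sort -rn | head
}
```

Run <span style="font-family: monospace;">sample 123456</span> and then <span style="font-family: monospace;">tally</span>; blocked calls float straight to the top of the count.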
<br />
I used this to find that all our threads were waiting on some JSON writing code that should have been fast.<br />
That is, a sample of about 10 runs of the sampler showed one thread deep in JSON decoding and 2-7 other threads all waiting in pthread mutex/condition code.Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-75838745028679874052011-01-27T15:01:00.001+10:302011-01-27T15:02:08.330+10:30Lean Development : Focus on LearningSo, continuing on with my notes on Lean Development, we're going to look at point 3 from <a href="http://www.poppendieck.com/">Tom Poppendieck</a>'s business card.
<br />
<br />
3. Focus on Learning - Scientific Method; Challenge Standards; Feedback; Continuous Improvement<br />
<br />
Let's start from the back of the list and work to the front, as I find that to be the logical progression of the sub-points.<br />
<br />
Continuous Improvement is about continually tweaking your production pipeline and/or the product so that you become more efficient in producing what the customer wants.<br />
<br />
Feedback is how you evaluate the changes you make to your pipeline. If your metrics improve, then accept the change; if not, work out why it didn't help and undo the change.<br />
<br />
Challenge Standards - This is a cultural attitude of not accepting the status quo. It's the need for the people involved to have the desire and ability to change the system. If something is a standard then there needs to be a reason why. If not, it's arbitrary and subject to being changed (provided the change can be demonstrated to be beneficial).<br />
<br />
Scientific Method - This is how we go about getting change implemented. First you hypothesise the measurable results the change should produce, then you make the change. If what you hypothesised didn't happen then you need to undo it and re-evaluate. "Switching to a frooble compiler should reduce test execution times by 50%, increasing throughput by 3 story-points per week". Results and experiments should be documented somewhere so that the company can learn from what was done (positive or negative).<br />
<br />
If we do these things we will understand how our process works, and will have a system by which the process can adapt to the changing environment in which the development pipeline exists.<br />
<br />
<br />Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-53489325588762232872011-01-24T14:55:00.002+10:302011-01-24T15:00:35.489+10:30Lean Development : Build Quality InThis is the fourth post in my mini-series about Lean Development.<br />
<br />
The bullet point I'm going to be writing about today is:<br />
<ul>
<li>Mistake-Proof with TDD; Write No New Legacy; Continuous Integration</li>
</ul>
<div>
TDD is "Test Driven Design". This means not only testing all your code, but writing your tests before you code, and letting your design be driven by the issues that arise while making those tests pass. This also holds for bug fixing -- create a test case that reproduces and narrows down the bug, then make that test pass.</div>
<br />
<div>
"Write No New Legacy" means don't write code that is hard to maintain. This means modular and easily testable code. Make sure the hard bits of the code are documented.</div>
<div>
<br />
Continuous Integration means that your code should be built and tested on every check-in. Broken builds should be addressed immediately. Implicit in this is some kind of version control system. The CI run is often a larger batch of tests than the specific unit/functional tests used while designing a single feature. Ideally it will run on multiple test systems, one for each system you deploy to.</div>
<br />
<div>
One idea behind these points is to make it hard for an unnoticed error to reach production. TDD makes it difficult to make the error in the first place. "Write No New Legacy" means you shouldn't have tricky untestable code paths to trip you up. "Continuous Integration" means you should never have a "broken system" on your hands.</div>
<br />
<div>
The bigger idea behind these points is continuous improvement of the code-base. If all your tests from TDD get plugged into the continuous integration suite, then you should never have a bug reappear. This should give you confidence that your code is doing what it is supposed to do.</div>
<br />
<div>
The final point is that these processes make it fast to track down a bug, less time hunting bugs means more time delivering real product.</div>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-66752363547556659802011-01-24T11:25:00.001+10:302011-01-24T11:26:43.712+10:30Why use saru for testing?I've had a few questions about why I use my own little testing framework, <a href="http://github.com/squishyhumans/saru">saru</a>, for testing, rather than using something standard like cxxunit or whatever.<br />
<br />
One of the key problems is testing across multiple languages. We use C++, PHP and Python for various pieces of our pipeline. Saru was designed so that plugging in a new language is easy, so if someday we need to support Java then I'm not worried. Now we could use cxxunit for C++, phpunit for PHP, etc., but I like to have unified reporting and a bit more integration between these tests.<br />
<br />
Another problem I have with using a pure C++ framework is sometimes I want to test the condition "<i>Class Foo should not be default constructible</i>." The easiest way for me to check that is to have a fragment of code that should <b>fail to compile</b> with given error messages. You can't test that with a C++ testing framework. (In this particular case you may be able to do something with SFINAE style template hackery, or fork and call a compiler, but these just feel very hackish)<br />
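As a sketch of the underlying idea (this is not saru's actual implementation; the file names and compiler invocation are hypothetical, and it assumes g++ or similar on the PATH), a compile-must-fail test can simply invert the compiler's exit status:

```shell
# The snippet default-constructs a class whose only constructor
# takes an int, so a conforming compiler must reject it.
cat > no_default_ctor.cpp <<'EOF'
struct Foo { explicit Foo(int); };
int main() { Foo f; return 0; }
EOF

# Pass only if compilation FAILS; diagnostics land in compile.log,
# where a fancier harness could also grep for an expected message.
if g++ -std=c++11 -fsyntax-only no_default_ctor.cpp 2>compile.log; then
    result="FAIL: Foo was default constructible"
else
    result="PASS: compiler rejected default construction"
fi
echo "$result"
```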
<br />
My third reason for using saru is that sometimes I want to do things that are not easily expressed in C++, but easily expressed in other languages. Things like "<i>are the source files correctly formatted?</i>", or "<i>do all C++ exceptions have a corresponding PHP handler?</i>", or "<i>are there tests for each of the classes in this directory?</i>". These are all C++ related questions, but much more easily expressed in bash using awk/sed/grep, or in Python, than in pure C++.<br />
<br />
So that's why I use saru.Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-43335692448772505682011-01-18T21:31:00.002+10:302011-01-19T13:15:54.068+10:30Lean Development : Eliminate Waste II<i>This is going to be another short post. Luckily we only have three points to cover...</i><br />
<br />
The remaining 3 points from my earlier post on the first principle of lean software development are:<br />
<ul>
<li>Over Processing</li>
<li>Over Production</li>
<li>Defects and Rework</li>
</ul>
Over-processing is doing more work on a product than is required. For software this may mean performance-tuning code that isn't performance critical, handling edge cases that can never occur, or making code overly flexible. All of these things waste time. The tricky part is working out what counts as "over". Thankfully this issue is addressed in an agile-style development system, where you get the roughest possible solution out to users and tweak it with their feedback. If the users aren't complaining then you don't need to fix it.<br />
<br />
Over-production, in a manufacturing environment, means producing more stock than the consumer will consume, or producing at a rate that leads to a build-up of inventory. In a software situation I consider features to be analogous to the product. If you are producing features that are not needed by the customer then you are over-producing. In this sense it is very similar to over-processing.<br />
<br />
Defects and Rework are, in my experience, where a large chunk of the waste in traditional software development lies. I seem to spend a lot of time fixing bugs. So what do we do about it? Test Driven Design, Continuous Integration and "Stop the Line" are some of the tools advocated by Lean software development. I guess I'll have to write about some of them in another post.<br />
<br />
<i>As I said this was going to be a short one. I'll move on to the next of the principles tomorrow... </i>Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-54708713902615915982011-01-17T21:19:00.002+10:302011-01-19T13:15:30.361+10:30Lean Development : Eliminate Waste<i>This post has taken longer than I'd hoped, and is consequently less polished than I'd hoped... but here goes...</i><br />
<br />
The first entry in our list of Lean Development concepts shows lean's focus on efficiency:<br />
<blockquote>
Eliminate Waste - No Extra Features, Churn or Boundaries</blockquote>
Before we can eliminate waste we must first understand what waste is. I think of this concept as "Don't do what you don't need to do" and waste is any resource use that is not driving your core business.<br />
<br />
Here is a list of 7 deadly wastes - which I believe come from the original Toyota lean methodology.<br />
<br />
<ol>
<li>Transportation</li>
<li>Inventory</li>
<li>Motion</li>
<li>Waiting</li>
<li>Over-Processing</li>
<li>Over-Production</li>
<li>Defects and Rework</li>
</ol>
<div>
In most of these cases the resource that is being wasted is time. </div>
<div>
I'm going to cover the first 3 today, and more tomorrow.</div>
<div>
<br /></div>
<div>
Transporting a commodity from A to B takes time and may have a monetary cost, but in a digital world this is less directly applicable. It makes more sense if you think of it as inefficiency in your supply chain. How long does it take you to get a "ready" version of the code out to your customers? How painful is this process? How robust? What are you doing in the process that could be automated or removed? Any manual interaction that is not needed is wasted effort. Any unneeded delays are wasted time. </div>
<div>
<br /></div>
<div>
Inventory in a software sense is completed features not yet shipped to the users. A feature is not providing value to a customer while it is sitting in your development version of the software. The sooner you can get a completed (and tested!) feature out to your users, the sooner it provides value to them, and thus drives value for you. Of course, if every push to the users takes a couple of days of effort then you have a transport issue that needs fixing first.</div>
<br />
<br />
Motion, to me, means a developer (or task) performing unnecessary actions on its way through the conceptual pipe from idea to implementation. These often crop up as bureaucracy - paperwork that will be discarded, double entry into multiple bug-tracking systems, emails to supervisors; stuff that makes a developer busy but not productive. This is wasted effort and time.<br />
<br />
Waiting usually means waiting for feedback from another party, or for compilation, or for tests to run. The developer ends up doing nothing productive, or is less productive due to context switching between tasks. So parallelize or speed up the process. Tests should be near instantaneous - if your full test suite takes an hour to run, split it up across 10 machines and it should take 6 minutes. Compilation should be fast (and incremental) - use ccache, distcc or something similar. Feedback should be fast, or at least predictable.<br />
<br />
<i>Well that's it for today. As I said it's a bit choppy, but hopefully you got something out of it.</i><br />
<br />Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0tag:blogger.com,1999:blog-3326823196305076292.post-58675350745561370092011-01-16T10:32:00.001+10:302011-01-17T21:19:57.198+10:30Lean DevelopmentI've been looking for something I could write a series of posts about, and I stumbled across <a href="http://www.poppendieck.com/">Tom Poppendieck</a>'s business card. An odd thing to blog about, you might think, but on the back of the card are 7 core ideas of Lean software development. I'm just going to repeat them here, then elaborate on what they mean to me. Lean is a style of development that I've been trying to move towards; it focuses heavily on efficiency, blending ideas from Agile development with ideas learned from large-scale manufacturing in Japan. An odd combination, but it feels right to me.<br />
<br />
So here's the list verbatim from Tom's card:<br />
<br />
1. Eliminate Waste - No Extra Features, Churn or Boundaries<br />
2. Build Quality In - Mistake-Proof with TDD; Write No New Legacy; Continuous Integration<br />
3. Focus on Learning - Scientific Method; Challenge Standards; Feedback; Continuous Improvement<br />
4. Defer Commitment - Break Dependencies; Maintain Options; Irreversible Decisions at Last Responsible Moment<br />
5. Deliver Fast - Low Cost and Quality and Speed; Queuing Theory: Flow, Limit Work to Capacity<br />
6. Respect People - Pride, Commitment, Trust and Applause; Effective Leadership; Respect Partners<br />
7. Optimize the Whole - Measure Up, Avoid Sub-Optimization; Whole Value Stream &amp; Whole Product<br />
<br />
Oh and just a note, any mistakes here are purely my own, as are any opinions presented. But credit for the ideas really belongs elsewhere.Michael Andersonhttp://www.blogger.com/profile/00877984226868333502noreply@blogger.com0