Wednesday, March 2, 2011

Introducing the fastjson library

So using my poor mans sampler (and the awesome shark tool on OS X) we found that  my application spends a lot (90%) of its time serializing and deserializing json. That's "not good".

I've had a lot of pain finding a good json library for C or C++. Some are one-way (encode or decode only). Some wont handle large numbers (uint64_t at least). Some can't handle unicode.

The boost version based on boot::property tree looked very promising at first. But it has a lot of nasty edge cases. (Unicode is broken unless you're using wchar_t, value nodes lose their type and become strings, certain edge cases wont serialize properly). We managed to work around all these issues, patching the boost libraries and putting hacks into our code. But the insurmountable problem is that it is SLOW.

Now I could look for another library and work around its idiosyncrasies, but boost is already the second library that we've tried to work around.. and none of the other libraries look promising (or have licenses we can work with...)

What can you do when one of your core libraries is causing most of your pain. (Bugs, performance, hacks) - rewrite it ;)

This is not a path we took lightly. But we have an alpha/beta working version of the library, and its about to go into our live servers. The code is a bit disorganized and could do with a bit of a clean up.. but our goals have been reached. It seems to parse and write json about 20x faster than our corresponding boost wrapped code. It performs far less allocations. It doesn't throw away type information. It does support arbitrary sized numbers. It does the right thing with unicode (except one edge case of properly converting some UTF8 into surrogate escaped UTF16 pairs).

Anyway I'll do a proper post on how to use it soon, but for now here's the github link:

https://github.com/mikeando/fastjson

I'd love people to play with it and leave feedback.

There's also a post about it on my company blog.