10 July 2011

A Note on YCSB

Recently we had to benchmark a number of in-memory databases, mainly open source ones. I didn't know about YCSB until my architect told me about it.
YCSB = Yahoo! Cloud Serving Benchmark
It didn't impress me at first because it was from Yahoo!. No offense, but Yahoo! still expects us to pay for its email POP3 access (Yahoo! Plus); they haven't learned anything from Gmail. Immaturity at its best. Nevertheless, we started our benchmarking with Oracle and MongoDB. I know neither of them is an in-memory database, but we liked MongoDB's concept of memory-mapped data.

I wrote the Oracle client for YCSB; the MongoDB client was included with the benchmark code (thanks to Yen Pai). Writing a client for YCSB is fairly simple, and that's what impressed me. But my impressions were washed away by the horrible glitches I found in the included drivers as well as in the YCSB code itself. There are a number of forks (including mine, which is a dead one, by the way) that provide a lot of patches to the original YCSB code and include many new clients as well, but the owner of the project, Brian Frank Cooper, has shown very little interest in reviewing them.

I ran the first benchmark on 100,000 data sets for all the workloads provided with YCSB. The default workloads are not sufficient to test all the operations properly, which forced me to create my own workload configuration. It turned out that MongoDB was just 2-4 times faster than Oracle, and that didn't impress us much. So we considered Gemfire and Hazelcast as well, both "real" in-memory databases, one open source and the other commercial (a 60-day trial in this case).
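For anyone curious what a custom workload looks like, here is a rough sketch, not the configuration I actually used. YCSB workloads are plain properties files; the property names below are the CoreWorkload ones, but every value (the operation mix, the zipfian distribution, and treating the 100,000 data sets as `recordcount`) is an assumption for illustration. The helper names `operation_mix` and `render_properties` are hypothetical, and the "mix sums to 1" check is my own sanity check, not something I am claiming YCSB enforces.

```python
# Sketch of a custom YCSB workload file, rendered from a Python dict.
# Property names are YCSB CoreWorkload properties; all values are
# illustrative assumptions, not the settings used for the benchmark.
workload = {
    "workload": "com.yahoo.ycsb.workloads.CoreWorkload",
    "recordcount": 100000,        # assumed: one record per "data set"
    "operationcount": 100000,
    "readallfields": "true",
    "readproportion": 0.4,        # hypothetical mix covering every operation
    "updateproportion": 0.3,
    "insertproportion": 0.2,
    "scanproportion": 0.1,
    "requestdistribution": "zipfian",
}


def operation_mix(props):
    """Return only the *proportion entries of the workload."""
    return {k: v for k, v in props.items() if k.endswith("proportion")}


def render_properties(props):
    """Render the dict as key=value lines, one per property."""
    total = sum(operation_mix(props).values())
    # Sanity check of this sketch only: keep the mix easy to read as percentages.
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"operation mix sums to {total}, expected 1.0")
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


if __name__ == "__main__":
    print(render_properties(workload), end="")
```

The rendered text can be saved to a file and passed to YCSB in place of one of the bundled workloads.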

Again I had to write the clients for both new DBs, and it turned out to be a piece of cake. I have to admit YCSB has great pluggability: plugging in a client for any DB just requires the driver libs plus some 20 lines of code, and you are done. YCSB can also run on multiple machines. YCSB offers a great platform for benchmarking any kind of database out there, and Yahoo! or Brian Cooper should realize this and put some more effort into its development.
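To give a feel for how little a client has to do, here is a minimal sketch of the shape of one. Real YCSB clients are Java classes plugged into the benchmark; this is only a Python analogue, with a dict standing in for the database driver. The class name `DictClient`, the simplified signatures and the `OK`/`ERROR` status codes are all assumptions for illustration, and the scan operation is left out to keep it short.

```python
# Python analogue of the operations a YCSB client implements.
# Not the YCSB API: names, signatures and status codes are simplified
# assumptions, and a plain dict replaces the real database driver.
OK, ERROR = 0, 1


class DictClient:
    def __init__(self):
        # table name -> {key -> {field -> value}}
        self.tables = {}

    def insert(self, table, key, values):
        """Store a new record (overwrites an existing key)."""
        self.tables.setdefault(table, {})[key] = dict(values)
        return OK

    def read(self, table, key, fields, result):
        """Copy the requested fields (all if fields is None) into result."""
        row = self.tables.get(table, {}).get(key)
        if row is None:
            return ERROR
        wanted = row if fields is None else {f: row[f] for f in fields if f in row}
        result.update(wanted)
        return OK

    def update(self, table, key, values):
        """Overwrite the given fields of an existing record."""
        row = self.tables.get(table, {}).get(key)
        if row is None:
            return ERROR
        row.update(values)
        return OK

    def delete(self, table, key):
        """Remove a record; report an error if it did not exist."""
        removed = self.tables.get(table, {}).pop(key, None)
        return OK if removed is not None else ERROR


if __name__ == "__main__":
    client = DictClient()
    client.insert("usertable", "user1", {"field0": "a", "field1": "b"})
    found = {}
    print(client.read("usertable", "user1", ["field0"], found), found)
```

Swapping the dict for calls into a real driver is essentially all the per-database work there is, which is why a new client stays so small.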

Here are the results of MongoDB, Gemfire and Hazelcast benchmarks on 100000 data sets:

Throughput (operations/sec), 100,000 operations

Operation    Gemfire     MongoDB     Hazelcast
Write        3032.324    5123.475    3709.336
Read         7634.170    7825.338    4315.367

MongoDB turns out to be the winner. The reason I can think of is that both Gemfire and Hazelcast run on the JVM, whereas MongoDB leaves everything to the OS by mapping the data into memory.
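How big is the win? A quick sketch of the arithmetic, using only the throughput figures from the table above. The helper names (`fastest`, `speedup`, `seconds_for`) are hypothetical, and `seconds_for` assumes each run really was 100,000 operations and that throughput is simply operations divided by elapsed time.

```python
# Throughput figures (ops/sec) copied from the results table above.
throughput = {
    "Gemfire":   {"write": 3032.324, "read": 7634.170},
    "MongoDB":   {"write": 5123.475, "read": 7825.338},
    "Hazelcast": {"write": 3709.336, "read": 4315.367},
}


def fastest(op):
    """Database with the highest throughput for the given operation."""
    return max(throughput, key=lambda db: throughput[db][op])


def speedup(op, db, baseline):
    """How many times faster db is than baseline for the given operation."""
    return throughput[db][op] / throughput[baseline][op]


def seconds_for(op, db, operations=100_000):
    """Elapsed time implied by the throughput, assuming ops / seconds."""
    return operations / throughput[db][op]


if __name__ == "__main__":
    for op in ("write", "read"):
        winner = fastest(op)
        for db in throughput:
            if db != winner:
                print(f"{op}: {winner} is {speedup(op, winner, db):.2f}x {db}")
```

The margins are uneven: on writes MongoDB is about 1.69 times Gemfire and 1.38 times Hazelcast, while on reads it is about 1.81 times Hazelcast but only around 2.5% ahead of Gemfire.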

More about YCSB can be found here and on the wiki

26 June 2011

Find Me Lazy

I was supposed to write a New Year post six months back; I didn't. Last week someone asked me what I am best at; I didn't (or couldn't) answer. Now I guess laziness is what I am best at. Last year's New Year post can be found here, which I posted just 3 days after New Year. This year I am late by just 6 months. So I'll summarize what's happened in the last 18 months:

Series I finished:
The Wire
OZ
Generation Kill
Six Feet Under
Twin Peaks
The Life and Times of Tim
24
The Lost Room
The Pacific
Daria
Long Way Round
An Idiot Abroad
Spartacus: Blood and Sand

Series which I started following:
Breaking Bad
Game of Thrones
In Treatment
It's Always Sunny in Philadelphia
The Ricky Gervais Show
The Venture Bros.
The IT Crowd
Boardwalk Empire
Fringe

Apart from series, movies and games, a few more insignificant things happened in my life:
Started a project, Crowl, and released the first revision (0.1).
Shifted to Noida from Bangalore.
Started gizmoage.com.
Finished a couple of novels.
Finished the following games:
  • Crysis 2
  • Blur
  • Battlefield: Bad Company 2
  • Need for Speed: Hot Pursuit
  • Call of Duty: Black Ops
  • Call of Duty: Modern Warfare 2
  • Just Cause 2
To add to the list, I bought a car and am still learning how to drive, with an L sign on the front as well as the back. In the Jack Sparrow way: the feeling which someone should have after getting a car, I don't have it.

Caution: A Blurry Pic Ahead!