vikasing: A Note on YCSB

Recently we had to benchmark several in-memory databases, mainly open source ones. I hadn't heard of YCSB until my architect told me about it.

YCSB = Yahoo! Cloud Serving Benchmark

It didn't impress me at first because it came from Yahoo!. No offense, but Yahoo! still expects us to pay for POP3 access to its email (Yahoo! Plus); they haven't learned anything from Gmail. Immaturity at its best. Nevertheless, we started our benchmarking with Oracle and MongoDB. I know neither of them is an in-memory database, but we liked MongoDB's concept of memory-mapped data.

I wrote the Oracle client for YCSB; the MongoDB client was included with the benchmark code (thanks to Yen Pai). Writing a client for YCSB is fairly simple, and that's what impressed me. But that impression was washed away by the horrible glitches I found in the included drivers as well as in the YCSB code itself. There are a number of forks (including mine, which is dead, by the way) that provide many patches to the original YCSB code and add many new clients, but the owner of the project, Brian Frank Cooper, shows very little interest in reviewing them.

I ran the first benchmark on 100,000 data sets for all the workloads provided with YCSB. The default workloads are not sufficient to test all the operations properly, which forced me to create my own workload configuration. It turned out that MongoDB was just 2-4 times faster than Oracle, and that didn't impress us much. So we considered GemFire and Hazelcast as well, both "real" in-memory databases: Hazelcast is open source and GemFire is commercial (a 60-day trial in this case).
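To give a rough idea of what a custom workload configuration involves, here is a small Python sketch that renders a YCSB-style workload properties file. The property names are the ones read by YCSB's CoreWorkload; the values are assumptions pieced together from this post and the comments (100,000 records of about 1 KB each, reads only) and are not necessarily the configuration behind the numbers reported here. The helper name render_workload is made up for the example.

```python
# Sketch: render a YCSB-style workload file from a dict.
# Property names follow YCSB's CoreWorkload; all values are assumptions.

def render_workload(props):
    """Return the text of a Java-style .properties file."""
    return "".join(f"{key}={value}\n" for key, value in props.items())

read_only = {
    # Package name as used by older YCSB releases.
    "workload": "com.yahoo.ycsb.workloads.CoreWorkload",
    "recordcount": 100000,        # records loaded before the run
    "operationcount": 100000,     # operations issued during the run
    "fieldcount": 10,             # 10 fields x 100 bytes is about 1 KB per record
    "fieldlength": 100,
    "readallfields": "true",
    "readproportion": 1.0,        # 100% reads
    "updateproportion": 0,
    "insertproportion": 0,
    "scanproportion": 0,
    "requestdistribution": "uniform",
}

workload_text = render_workload(read_only)
print(workload_text)
```

A write-only variant could be approximated by swapping the proportions (for example insertproportion=1.0 and readproportion=0). The thread count is not part of the workload file; it is passed to the YCSB client on the command line.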

Again I had to write the clients for both new DBs, and it turned out to be a piece of cake. I have to admit YCSB has great pluggability: plugging in a client for any DB just requires the driver libs plus some 20 lines of code, and you are done. YCSB can also run on multiple machines. It offers a great platform for benchmarking any kind of database out there, and I hope Yahoo! or Brian Cooper realizes that and puts some more effort into its development.
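The plug-in idea can be sketched in a few lines. Real YCSB clients are Java classes that extend YCSB's DB class; the Python below is only a simplified illustration of the pattern, and every name in it (BenchmarkClient, DictClient, run) is hypothetical rather than part of YCSB. A plain dict stands in for the database, and the driver reports throughput as operations divided by elapsed seconds.

```python
import time
from abc import ABC, abstractmethod


class BenchmarkClient(ABC):
    """Hypothetical adapter interface; NOT YCSB's real API."""

    @abstractmethod
    def insert(self, key, values):
        """Store one record under key."""

    @abstractmethod
    def read(self, key):
        """Return the record stored under key, or None."""


class DictClient(BenchmarkClient):
    """Stand-in 'database' backed by a plain dict."""

    def __init__(self):
        self.store = {}

    def insert(self, key, values):
        self.store[key] = dict(values)

    def read(self, key):
        return self.store.get(key)


def run(client, record_count):
    """Insert, then read, record_count records; return ops/sec per phase."""
    record = {"field0": "x" * 100}

    start = time.perf_counter()
    for i in range(record_count):
        client.insert(f"user{i}", record)
    write_secs = max(time.perf_counter() - start, 1e-9)  # avoid dividing by zero

    start = time.perf_counter()
    hits = sum(client.read(f"user{i}") is not None for i in range(record_count))
    read_secs = max(time.perf_counter() - start, 1e-9)

    return {
        "hits": hits,
        "write_ops_per_sec": record_count / write_secs,
        "read_ops_per_sec": record_count / read_secs,
    }


result = run(DictClient(), 1000)
print(result)
```

Swapping in another store only means writing another small subclass; the driver loop stays the same, which is the property that makes adding a database cheap.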

Here are the results of the MongoDB, GemFire and Hazelcast benchmarks on 100,000 data sets:

Throughput (operations/sec) for 100,000 operations

Operation    GemFire     MongoDB     Hazelcast
Write        3032.324    5123.475    3709.336
Read         7634.170    7825.338    4315.367

MongoDB turned out to be the winner. The only reason I can think of is that both GemFire and Hazelcast run on the JVM, whereas MongoDB leaves everything to the OS by mapping its data into memory.
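To put the gap in proportion, the short Python sketch below normalizes each figure from the results table against MongoDB's. The numbers are copied from the table; the function name relative_to is just an illustrative choice.

```python
# Throughput figures (ops/sec) copied from the results table.
throughput = {
    "GemFire":   {"write": 3032.324, "read": 7634.170},
    "MongoDB":   {"write": 5123.475, "read": 7825.338},
    "Hazelcast": {"write": 3709.336, "read": 4315.367},
}


def relative_to(baseline, results):
    """Express every throughput as a fraction of the baseline database's."""
    base = results[baseline]
    return {
        db: {op: round(value / base[op], 2) for op, value in ops.items()}
        for db, ops in results.items()
    }


ratios = relative_to("MongoDB", throughput)
for db, ops in ratios.items():
    print(db, ops)
```

By this measure GemFire's reads are within about 2% of MongoDB's while its writes are roughly 41% lower, and Hazelcast trails on both operations (about 28% lower on writes and 45% lower on reads).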

More about YCSB can be found here and on the wiki

Comments:

Alex, January 12, 2012 at 7:35 PM
There has been a contribution of a GemFire plugin to YCSB now, so that should be easier. I am curious whether you can share more details on the YCSB workload file (threads, data size, read/update ratio, etc.) and the GemFire configuration you tested and reported in this post (heap size, number of nodes, actual hardware, etc.).
(Disclaimer: I currently work for VMware on GemFire and related technologies.)

Vik, February 6, 2012 at 7:09 PM
@Alex
Server specs: Xeon, 2 quad-core CPUs, 2.53 GHz, RAM 8
Data set size: 1KB
Ratio: 100% writes / 100% reads
Threads: 1
Nodes: 1
Heap size: 2GB

Hope this helps!
-vik

Anonymous, July 15, 2012 at 5:26 AM
Nice. Is there a good tutorial on how to add a new database to the YCSB framework?

Vik, August 19, 2012 at 12:54 PM
@Ano....
Yes, there is a tutorial on adding a new DB to YCSB:
https://github.com/brianfrankcooper/YCSB/wiki/Adding-a-Database

10 July 2011
