LightCloud
戚成礼
2023-12-01
[size=x-large]http://opensource.plurk.com/LightCloud/
Distributed and persistent key-value database
Features
Built on Tokyo Tyrant, one of the fastest key-value databases [benchmark]. Tokyo Tyrant has been in development for many years and is used in production by Plurk.com, mixi.jp and scribd.com, to name a few.
Great performance (comparable to memcached!)
Can store millions of keys on very few servers - tested in production
Scale out by just adding nodes
Nodes are replicated via master-master replication. Automatic failover and load balancing are supported from the start
Ability to script and extend using Lua. Included extensions are incr and a fixed list
Hot backups and restore: Take backups and restore servers without shutting them down
LightCloud manager can control nodes, take backups and give you a status on how your nodes are doing
Very small footprint (the LightCloud client is around 500 lines and the manager about 400)
Python only, but LightCloud should be easy to port to other languages.
Ruby port under development!
But that's not all: we also support Redis as an alternative to Tokyo Tyrant!
For benchmarks and more details about Redis, see "LightCloud adds support for Redis".
Stability
LightCloud is production ready: Plurk.com uses it to store millions of keys on only two servers, which together run 3 lookup nodes and 6 storage nodes (these servers also run MySQL).
How does LightCloud differ from memcached and MySQL?
memcached is used for caching, meaning that after some time items saved to memcached are deleted. LightCloud is persistent, meaning that once you save an item, it will be there forever (or until you delete/update it).
MySQL and other relational databases are not efficient for storing key-value pairs; a key-value database like LightCloud is. You can also extend LightCloud with an inverted index for efficient search, or with other "special databases" that fit your domain.
The bottom line is that LightCloud is not a replacement for memcached or MySQL - it's a complement that can be used in situations where your data does not fit that well into the relational model.
How does LightCloud differ from Redis and memcachedb?
LightCloud is a distributed and horizontally scalable database; memcachedb and Redis are not. This is crucial to understand, and from what we have read, many people have missed it.
Basically, LightCloud could be built on top of memcachedb or Redis, with those serving as the nodes instead of Tokyo Tyrant. Tokyo Tyrant was chosen because it's the fastest key-value database around, able to do 1 million SETs and GETs in under 1 second (see benchmark).
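The text above does not say how keys are spread across nodes. One common technique for a store that scales out by adding nodes is consistent hashing, and the following is a minimal sketch of that idea only. The `HashRing` class, the `replicas` parameter and the node names `storage1` to `storage4` are all hypothetical and are not taken from the LightCloud source.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping keys to node names (illustrative only)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points placed on the ring per node
        self._points = []         # sorted hash positions
        self._owner = {}          # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            point = self._hash("%s:%d" % (node, i))
            self._owner[point] = node
            bisect.insort(self._points, point)

    def get_node(self, key):
        if not self._points:
            raise LookupError("ring has no nodes")
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = ["user:%d" % i for i in range(1000)]
ring = HashRing(["storage1", "storage2", "storage3"])  # hypothetical node names
before = {k: ring.get_node(k) for k in keys}

ring.add_node("storage4")
after = {k: ring.get_node(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print("keys moved after adding a node: %d of %d" % (len(moved), len(keys)))
```

With a ring like this, adding a node only reassigns the keys that now fall on the new node; every other key stays where it was, which is what makes "just adding nodes" cheap.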
Benchmark against memcached
Please note that comparing to memcached is unfair, as memcached is memory-only while LightCloud has to hit the disk. That said, here is what it takes to do 10,000 gets and sets:
Elapsed for 10000 gets: 1.74538516998 seconds [memcache]
Elapsed for 10000 gets: 3.57339096069 seconds [lightcloud]
Elapsed for 10000 sets: 1.88236999512 seconds [memcache]
Elapsed for 10000 sets: 9.23674893379 seconds [lightcloud]
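To make the figures above easier to compare, here is a small calculation that turns them into operations per second and slowdown factors. The elapsed times are copied from the output above; the variable names are just for illustration.

```python
# Elapsed seconds for 10,000 operations, copied from the benchmark output above.
OPS = 10000
elapsed = {
    ("memcache", "gets"): 1.74538516998,
    ("lightcloud", "gets"): 3.57339096069,
    ("memcache", "sets"): 1.88236999512,
    ("lightcloud", "sets"): 9.23674893379,
}

# Operations per second for each system and operation type.
throughput = {k: OPS / secs for k, secs in elapsed.items()}
for (system, op), rate in sorted(throughput.items()):
    print("%-10s %s: %8.0f ops/sec" % (system, op, rate))

# How many times longer LightCloud takes than memcache for the same work.
slowdown = {
    op: elapsed[("lightcloud", op)] / elapsed[("memcache", op)]
    for op in ("gets", "sets")
}
for op, factor in slowdown.items():
    print("lightcloud %s take %.1fx as long as memcache" % (op, factor))
```

In other words, LightCloud manages roughly 2,800 gets and 1,100 sets per second in this test, which is about 2x slower than memcached for gets and about 4.9x slower for sets.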
Benchmark program
If operations were done in batches and time wasn't spent in Python and the network layer, Tokyo Tyrant would perform much better. The official Tokyo Cabinet benchmark reports the following stats:
1 million GETS in < 0.5 seconds
1 million SETS in < 0.5 seconds
These numbers are not that realistic in practice, and therefore we compare LightCloud to memcached.
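As a rough sketch of how per-operation timings like the ones above can be collected, the loop below times 10,000 individual sets and gets from Python. It is not the original benchmark program. `InMemoryStore` is a hypothetical stand-in with no network or disk behind it, so whatever it reports is mostly the Python-side overhead mentioned above; to measure a real server you would swap in an actual client object with `set` and `get` methods.

```python
import time


class InMemoryStore:
    """Hypothetical stand-in client: a dict behind set/get methods."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def time_ops(label, func, count=10000):
    """Call func(i) count times, one operation per call, and report the time."""
    start = time.perf_counter()
    for i in range(count):
        func(i)
    elapsed = time.perf_counter() - start
    print("Elapsed for %d %s: %s seconds" % (count, label, elapsed))
    return elapsed


store = InMemoryStore()
set_time = time_ops("sets", lambda i: store.set("key%d" % i, "value%d" % i))
get_time = time_ops("gets", lambda i: store.get("key%d" % i))
```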
[/size]