Q:你们redis怎么做的分布式
A:我们公司redis用的murmurHash做的分片;
Q:讲讲murmurHash的原理呗
A:额……这块没有深入了解过(真TM掉分)
哈希算法简单来说就是将一个元素映射成另一个元素,可以简单分类两类,
这里说说Jedis中的Shard是如何使用一致性hash的
首先是hash函数,在Jedis中有两种Hash算法可供选择,分别是MurMurHash和MD5. 按照Jedis的说法MurmurHash更快,效果更好些。
package redis.clients.util;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
public interface Hashing {
public static final Hashing MURMUR_HASH = new MurmurHash();
public ThreadLocal<MessageDigest> md5Holder = new ThreadLocal<MessageDigest>();
public static final Hashing MD5 = new Hashing() {
public long hash(String key) {
return hash(SafeEncoder.encode(key));
}
public long hash(byte[] key) {
try {
if (md5Holder.get() == null) {
md5Holder.set(MessageDigest.getInstance("MD5"));
}
} catch (NoSuchAlgorithmException e) {
throw new IllegalStateException("++++ no md5 algorythm found");
}
MessageDigest md5 = md5Holder.get();
md5.reset();
md5.update(key);
byte[] bKey = md5.digest();
long res = ((long) (bKey[3] & 0xFF) << 24)
| ((long) (bKey[2] & 0xFF) << 16)
| ((long) (bKey[1] & 0xFF) << 8) | (long) (bKey[0] & 0xFF);
return res;
}
};
public long hash(String key);
public long hash(byte[] key);
}
涉及一些nio的buffer操作,这里不说了
package redis.clients.util;
import com.google.common.hash.Hashing;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
public class MurmurHash implements Hashing {
public static int hash(byte[] data, int seed) {
return hash(ByteBuffer.wrap(data), seed);
}
public static int hash(byte[] data, int offset, int length, int seed) {
return hash(ByteBuffer.wrap(data, offset, length), seed);
}
public static int hash(ByteBuffer buf, int seed) {
// save byte order for later restoration
ByteOrder byteOrder = buf.order();
buf.order(ByteOrder.LITTLE_ENDIAN);
int m = 0x5bd1e995;
int r = 24;
int h = seed ^ buf.remaining();
int k;
while (buf.remaining() >= 4) {
k = buf.getInt();
k *= m;
k ^= k >>> r;
k *= m;
h *= m;
h ^= k;
}
if (buf.remaining() > 0) {
ByteBuffer finish = ByteBuffer.allocate(4).order(
ByteOrder.LITTLE_ENDIAN);
// for big-endian version, use this first:
// finish.position(4-buf.remaining());
finish.put(buf).rewind();
h ^= finish.getInt();
h *= m;
}
h ^= h >>> 13;
h *= m;
h ^= h >>> 15;
buf.order(byteOrder);
return h;
}
public static long hash64A(byte[] data, int seed) {
return hash64A(ByteBuffer.wrap(data), seed);
}
public static long hash64A(byte[] data, int offset, int length, int seed) {
return hash64A(ByteBuffer.wrap(data, offset, length), seed);
}
public static long hash64A(ByteBuffer buf, int seed) {
ByteOrder byteOrder = buf.order();
buf.order(ByteOrder.LITTLE_ENDIAN);
long m = 0xc6a4a7935bd1e995L;
int r = 47;
long h = seed ^ (buf.remaining() * m);
long k;
while (buf.remaining() >= 8) {
k = buf.getLong();
k *= m;
k ^= k >>> r;
k *= m;
h ^= k;
h *= m;
}
if (buf.remaining() > 0) {
ByteBuffer finish = ByteBuffer.allocate(8).order(
ByteOrder.LITTLE_ENDIAN);
// for big-endian version, do this first:
// finish.position(8-buf.remaining());
finish.put(buf).rewind();
h ^= finish.getLong();
h *= m;
}
h ^= h >>> r;
h *= m;
h ^= h >>> r;
buf.order(byteOrder);
return h;
}
public long hash(byte[] key) {
return hash64A(key, 0x1234ABCD);
}
public long hash(String key) {
return hash(SafeEncoder.encode(key));
}
}
我们公司自己做的封装就不贴code了,
看一下Jedis自己的一致性Hash:
public S getShardInfo(byte[] key) {
SortedMap<Long, S> tail = nodes.tailMap(algo.hash(key));
if (tail.isEmpty()) {
return nodes.get(nodes.firstKey());
}
return tail.get(tail.firstKey());
}
比较经典的一致性hash实现方案了,这里就不深入展开了。
说一下algo.hash(key)
这一步murmurHash为啥Hash碰撞少,性能快吧。
步骤1:
public long hash(byte[] key) {
return hash64A(key, 0x1234ABCD);
}
步骤2:
public static long hash64A(byte[] data, int seed) {
return hash64A(ByteBuffer.wrap(data), seed);
}
步骤3:
public static long hash64A(ByteBuffer buf, int seed) {
ByteOrder byteOrder = buf.order();
buf.order(ByteOrder.LITTLE_ENDIAN);
long m = 0xc6a4a7935bd1e995L;
int r = 47;
long h = seed ^ (buf.remaining() * m);
long k;
while (buf.remaining() >= 8) {
k = buf.getLong();
k *= m;
k ^= k >>> r;
k *= m;
h ^= k;
h *= m;
}
if (buf.remaining() > 0) {
ByteBuffer finish = ByteBuffer.allocate(8).order(
ByteOrder.LITTLE_ENDIAN);
// for big-endian version, do this first:
// finish.position(8-buf.remaining());
finish.put(buf).rewind();
h ^= finish.getLong();
h *= m;
}
h ^= h >>> r;
h *= m;
h ^= h >>> r;
buf.order(byteOrder);
return h;
}
hash64A(key, 0x1234ABCD);
long m = 0xc6a4a7935bd1e995L;
int r = 47;
murmurHash这个三个常量听说是Austin Appleby用大量测试数据肝出来的……
性能快应该是与大量使用位操作有关
里面的变化逻辑,自己比划吧,不一行行解释了;我表示看了跟没看一样……大神的世界,学都没法学,害……
MurmurHash算法,自称超级快的hash算法,是FNV的4-5倍。官方数据如下:
OneAtATime – 354.163715 mb/sec
FNV – 443.668038 mb/sec
SuperFastHash – 985.335173 mb/sec
lookup3 – 988.080652 mb/sec
MurmurHash 1.0 – 1363.293480 mb/sec
MurmurHash 2.0 – 2056.885653 mb/sec