
Adding the snappy compression algorithm to Hadoop and HBase

孙京
2023-12-01

Reference posts:
http://blog.cheyo.net/197.html
https://blog.csdn.net/bbaiggey/article/details/53583923
https://github.com/google/snappy/releases

I. Environment preparation

Prerequisites: hadoop-3.1.2, hbase-2.0.5, maven-3, jdk1.8
1. Install Maven 3 and configure its environment variables
2. Run mvn -v to confirm that the installation succeeded

root@xiaoma:~# javac -version
javac 1.8.0_141

xiaoma@xiaoma:~$ mvn -v
Apache Maven 3.6.2

root@xiaoma:~# hadoop version
Hadoop 3.1.2

root@xiaoma:~# hbase version
HBase 2.0.5

3. Environment variables on the test machine:

export JAVA_HOME=/usr/lib/jdk1.8.0_141 
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:$PATH

export MAVEN_HOME=/home/xiaoma/apache-maven-3.6.2
export PATH=$PATH:$MAVEN_HOME/bin

export HADOOP_HOME=/home/xiaoma/hadoop-3.1.2
export HBASE_HOME=/home/xiaoma/hbase-2.0.5

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin
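
Once these variables are loaded, it can help to confirm that every required tool is actually reachable on PATH. The following is a minimal sketch; the function name check_tools is made up for illustration, and the tool list in the example simply mirrors the prerequisites listed earlier.

```shell
#!/usr/bin/env bash
# Report which of the given commands are missing from PATH.
# Prints one "missing: <name>" line per absent command and
# returns non-zero if at least one command is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (hypothetical invocation, tool names from the prerequisites):
# check_tools javac mvn hadoop hbase
```

If the function prints nothing and returns 0, all listed tools were found.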

II. Install gcc, autoconf, automake, libtool

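Optionally refresh the package index first:

root@xiaoma:/# apt-get update

Install gcc (on CentOS, use yum install gcc instead):

root@xiaoma:/# apt-get install build-essential

Install autoconf, automake, and libtool by running the following commands in turn: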
root@xiaoma:/# apt-get install autoconf
root@xiaoma:/# apt-get install automake
root@xiaoma:/# apt-get install libtool
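
The separate install commands can also be combined into one. The sketch below only prints the command to run, picking apt-get or yum depending on which is present. The function name toolchain_install_cmd is hypothetical, and the yum package list (gcc-c++ and make in addition to gcc) is an assumption about what a CentOS build of snappy needs, not something taken from this walkthrough.

```shell
#!/usr/bin/env bash
# Print a single install command for the build toolchain,
# chosen by the package manager available on this machine.
toolchain_install_cmd() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "apt-get install -y build-essential autoconf automake libtool"
    elif command -v yum >/dev/null 2>&1; then
        # Assumed CentOS equivalents of build-essential.
        echo "yum install -y gcc gcc-c++ make autoconf automake libtool"
    else
        echo "no supported package manager found" >&2
        return 1
    fi
}

# Example: review the printed command, then run it as root.
# toolchain_install_cmd
```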

III. Install the Snappy library

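A stock Hadoop installation does not have native snappy support available: hadoop checknative reports snappy as false by default. The same command can be used later to confirm that the installation succeeded.
Download URL (version 1.1.1):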
https://src.fedoraproject.org/repo/pkgs/snappy/snappy-1.1.1.tar.gz/8887e3b7253b22a31f5486bca3cbc1c2/snappy-1.1.1.tar.gz
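1. Download and extract

root@xiaoma:/home/xiaoma# tar -zxvf /home/xiaoma/snappy-1.1.1.tar.gz

2. Enter the extracted directory, then build and install the native library by running the following commands in turn: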

root@xiaoma:/home/xiaoma/snappy-1.1.1# ./configure

root@xiaoma:/home/xiaoma/snappy-1.1.1# make

root@xiaoma:/home/xiaoma/snappy-1.1.1# make install

By default the library is installed to /usr/local/lib. Listing that directory shows the generated files:

root@xiaoma:/home/xiaoma/snappy-1.1.1# cd /usr/local/lib
root@xiaoma:/usr/local/lib# ll
total 612
drwxr-xr-x  3 root root    4096 Nov 13 09:48 ./
drwxr-xr-x 10 root root    4096 Feb 14  2019 ../
-rw-r--r--  1 root root  410610 Nov 13 09:48 libsnappy.a
-rwxr-xr-x  1 root root     953 Nov 13 09:48 libsnappy.la*
lrwxrwxrwx  1 root root      18 Nov 13 09:48 libsnappy.so -> libsnappy.so.1.2.0*
lrwxrwxrwx  1 root root      18 Nov 13 09:48 libsnappy.so.1 -> libsnappy.so.1.2.0*
-rwxr-xr-x  1 root root  196264 Nov 13 09:48 libsnappy.so.1.2.0*
drwxrwsr-x  3 root staff   4096 Feb 14  2019 python3.6/

If there were no errors and the files and symlinks match the listing above, Snappy has been installed successfully.
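
The visual check of the directory listing can also be scripted. This is a minimal sketch: the function name check_snappy_lib is made up for illustration, and it only verifies that libsnappy.so exists and resolves to a regular file, not that the library is loadable.

```shell
#!/usr/bin/env bash
# Check that <dir>/libsnappy.so exists and resolves to a regular file.
# The directory defaults to /usr/local/lib, the install location used above.
check_snappy_lib() {
    local dir=${1:-/usr/local/lib}
    local link="$dir/libsnappy.so"
    if [ -f "$link" ]; then   # -f follows symlinks to their target
        echo "ok: $link"
    else
        echo "not found or dangling: $link"
        return 1
    fi
}

# Example:
# check_snappy_lib /usr/local/lib
```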

IV. Build hadoop-snappy from source

1. Download the source
https://codeload.github.com/electrum/hadoop-snappy/zip/master

2. Extract the archive (if unzip is missing, install it with apt-get install unzip)

root@xiaoma:/home/xiaoma# unzip hadoop-snappy-master.zip

3. Enter the extracted directory

root@xiaoma:/home/xiaoma# cd hadoop-snappy-master

4. Build

root@xiaoma:/home/xiaoma/hadoop-snappy-master# mvn package

The build finishes after a few minutes.

5. Copy the hadoop-snappy libraries
When the build completes, a target directory is created automatically.
Enter its native-library subdirectory:

root@xiaoma:/home/xiaoma/hadoop-snappy-master# cd /home/xiaoma/hadoop-snappy-master/target/hadoop-snappy-0.0.1-SNAPSHOT-tar/hadoop-snappy-0.0.1-SNAPSHOT/lib/native/Linux-amd64-64

Copy the native library files and the jar into Hadoop's lib/native directory (adjust the destination to match your HADOOP_HOME):

root@xiaoma:/home/xiaoma/hadoop-snappy-master/target/hadoop-snappy-0.0.1-SNAPSHOT-tar/hadoop-snappy-0.0.1-SNAPSHOT/lib/native/Linux-amd64-64# cp lib* /home/xiaoma/hadoop-3.1.2/lib/native
root@xiaoma:/# cd /home/xiaoma/hadoop-snappy-master/target/hadoop-snappy-0.0.1-SNAPSHOT-tar/hadoop-snappy-0.0.1-SNAPSHOT/lib
root@xiaoma:/home/xiaoma/hadoop-snappy-master/target/hadoop-snappy-0.0.1-SNAPSHOT-tar/hadoop-snappy-0.0.1-SNAPSHOT/lib# cp hadoop-snappy-0.0.1-SNAPSHOT.jar /home/xiaoma/hadoop-3.1.2/lib/native

6. Verify that the build succeeded

root@xiaoma:~# hadoop checknative
The snappy entry now reports true:
snappy:  true /home/xiaoma/hadoop-3.1.2/lib/native/libsnappy.so.1
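
For scripted checks, the snappy line can be picked out of the hadoop checknative output instead of reading it by eye. This is a minimal sketch that assumes the output format shown above (a line starting with "snappy:" followed by true or false); the function name snappy_enabled is hypothetical.

```shell
#!/usr/bin/env bash
# Read `hadoop checknative` output on stdin and succeed only if the
# line for snappy reports "true".
snappy_enabled() {
    grep -Eq '^snappy:[[:space:]]+true([[:space:]]|$)'
}

# Example:
# hadoop checknative 2>/dev/null | snappy_enabled && echo "snappy OK"
```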

V. Configure Hadoop for snappy

root@xiaoma:~# cd /home/xiaoma/hadoop-3.1.2/etc/hadoop

1. Add the following line to hadoop-env.sh:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/:/usr/local/lib/

2. Add the snappy configuration to core-site.xml:

root@xiaoma:/home/xiaoma/hadoop-3.1.2/etc/hadoop# vim core-site.xml
<property>
    <name>io.compression.codecs</name>
    <value>
      org.apache.hadoop.io.compress.GzipCodec,
      org.apache.hadoop.io.compress.DefaultCodec,
      org.apache.hadoop.io.compress.BZip2Codec,
      org.apache.hadoop.io.compress.SnappyCodec
    </value>
</property>
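
After editing, a quick sanity check is to confirm that the codec class name really is in the file. The sketch below is a plain text search, so it would also match a commented-out entry; the function name codec_configured is made up for illustration. The hdfs getconf command in the trailing comment reads the effective value from the loaded configuration instead.

```shell
#!/usr/bin/env bash
# Succeed if the given config file mentions the codec class name.
# This is a text search, not an XML parse.
codec_configured() {
    local file=$1
    local codec=${2:-org.apache.hadoop.io.compress.SnappyCodec}
    grep -Fq "$codec" "$file"
}

# Example:
# codec_configured "$HADOOP_HOME/etc/hadoop/core-site.xml" && echo "SnappyCodec listed"
# Effective value as Hadoop sees it:
# hdfs getconf -confKey io.compression.codecs
```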

VI. Configure HBase for snappy (largely the same as for Hadoop)

1. Enter the target directory

root@xiaoma:/# cd /home/xiaoma/hadoop-snappy-master/target

2. Copy the native library directory into HBase (adjust the paths to match your environment variables)

root@xiaoma:/home/xiaoma/hadoop-snappy-master/target# cp -r hadoop-snappy-0.0.1-SNAPSHOT-tar/hadoop-snappy-0.0.1-SNAPSHOT/lib/native/Linux-amd64-64 /home/xiaoma/hbase-2.0.5/lib/native

3. Set the environment variables in hbase-env.sh

root@xiaoma:/# cd /home/xiaoma/hbase-2.0.5/conf
root@xiaoma:/home/xiaoma/hbase-2.0.5/conf# vim hbase-env.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/:/usr/local/lib/
export HBASE_LIBRARY_PATH=$HBASE_LIBRARY_PATH:$HBASE_HOME/lib/native/:/usr/local/lib/

VII. Verify the configuration

1. Restart the services

root@xiaoma:/# stop-hbase.sh
root@xiaoma:/# stop-dfs.sh
root@xiaoma:/# start-dfs.sh
root@xiaoma:/# start-hbase.sh
root@xiaoma:/# jps
26994 HMaster
25830 DataNode
27387 Jps
26124 SecondaryNameNode
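
Reading the jps listing can be automated when the restart is scripted. The following is a minimal sketch that reads jps output on stdin and reports any expected daemon that is absent; the function name missing_daemons is hypothetical, and which daemons to expect depends on the roles this machine runs.

```shell
#!/usr/bin/env bash
# Read `jps` output on stdin ("<pid> <name>" per line) and print
# "not running: <name>" for every expected daemon that is absent.
# Returns non-zero if any expected daemon is missing.
missing_daemons() {
    local listing name rc=0
    listing=$(cat)
    for name in "$@"; do
        # Compare the whole second field so that NameNode does not
        # match SecondaryNameNode.
        if ! printf '%s\n' "$listing" | awk '{print $2}' | grep -Fxq "$name"; then
            echo "not running: $name"
            rc=1
        fi
    done
    return "$rc"
}

# Example:
# jps | missing_daemons HMaster DataNode SecondaryNameNode
```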

2. Test

root@xiaoma:/# hbase shell
hbase(main):006:0> create 'test', { NAME => 'f', COMPRESSION => 'snappy'}
Created table test
Took 0.7462 seconds                                                                                                            
=> Hbase::Table - test

hbase(main):009:0* describe "test"
Table test is ENABLED                                                                                                                       
test                                                                                                                                        
COLUMN FAMILIES DESCRIPTION                                                                                                                 
{NAME => 'f', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_
ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', C
ACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => '
SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}                                                                                        
1 row(s)

The COMPRESSION => 'SNAPPY' attribute appears in the table description, so the setup is complete.
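
Creating a table only records the compression setting in the schema. As an additional check, HBase ships a CompressionTest utility that writes and reads a small file with the chosen codec. The wrapper below is a minimal sketch: the function name run_compression_test and the default path file:///tmp/compression-test are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Run HBase's CompressionTest utility for a codec, if hbase is on PATH.
# Arguments: codec name (default: snappy), test file path (hypothetical default).
run_compression_test() {
    local codec=${1:-snappy}
    local path=${2:-file:///tmp/compression-test}
    if ! command -v hbase >/dev/null 2>&1; then
        echo "hbase not found on PATH" >&2
        return 127
    fi
    hbase org.apache.hadoop.hbase.util.CompressionTest "$path" "$codec"
}

# Example:
# run_compression_test snappy
```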
