
Configuring the HBase Connector for openLooKeng

端木宏才
2023-12-01


The first impression openLooKeng gives is of a thin wrapper around Presto (after skimming the source tree, I once thought so myself), but that is not the case: the deeper you dig, the more original work you find, such as the multi-data-center concept and the HBase connector. Kudos to openLooKeng for that. (I am still a newcomer here; yesterday I happily found a bug, fixed it myself, and then discovered it had already been fixed twelve days earlier. Oh well.) This post walks through a basic configuration of the openLooKeng HBase connector and finishes with a simple Cartesian product against a Hive data source.

  1. This guide assumes you are already familiar with installing openLooKeng; if not, see my earlier post: manually installing openLooKeng.

  2. The newest version currently downloadable from the openLooKeng website is 1.0.1, whose HBase connector has too many bugs to be usable. Either wait for the official 1.1.0 release or build 1.1.0 yourself; we do the latter here. Download the source from the openLooKeng source repository. (I would normally prefer GitHub, but the connection is painfully slow, and supporting a homegrown project is reason enough.) From a cloned repository, a plain

mvn install -DskipTests

is all it takes. If you build from a downloaded source archive instead of a git clone, however, the build fails:

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.497 s
[INFO] Finished at: 2020-12-09T13:40:27+08:00
[INFO] Final Memory: 111M/1222M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal pl.project13.maven:git-commit-id-plugin:3.0.1:revision (default) on project hetu-common: .git directory is not found! Please specify a valid [dotGitDirectory] in your pom.xml -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.

The fix is to edit the pom and skip git-commit-id-plugin with the configuration below (alternatively, running `git init` and making an initial commit inside the source directory should also give the plugin the .git metadata it is looking for):

...
<plugins>
    <!-- newly added: skip git-commit-id-plugin -->
    <plugin>
        <groupId>pl.project13.maven</groupId>
        <artifactId>git-commit-id-plugin</artifactId>
        <configuration>
            <skip>true</skip>
        </configuration>
    </plugin>
...

Then build again:

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] presto-root ........................................ SUCCESS [  5.333 s]
[INFO] hetu-common ........................................ SUCCESS [ 10.612 s]
[INFO] presto-spi ......................................... SUCCESS [ 54.071 s]
[INFO] presto-plugin-toolkit .............................. SUCCESS [  3.930 s]
[INFO] hetu-transport ..................................... SUCCESS [  1.860 s]
[INFO] presto-client ...................................... SUCCESS [  4.068 s]
[INFO] presto-parser ...................................... SUCCESS [ 33.219 s]
[INFO] presto-geospatial-toolkit .......................... SUCCESS [  4.980 s]
[INFO] presto-array ....................................... SUCCESS [  1.825 s]
[INFO] presto-matching .................................... SUCCESS [  0.662 s]
[INFO] presto-memory-context .............................. SUCCESS [  0.468 s]
[INFO] presto-tpch ........................................ SUCCESS [  6.006 s]
[INFO] hetu-hazelcast ..................................... SUCCESS [ 56.355 s]
[INFO] presto-testing-docker .............................. SUCCESS [ 28.761 s]
[INFO] hetu-filesystem-client ............................. SUCCESS [01:01 min]
[INFO] hetu-seed-store .................................... SUCCESS [  0.933 s]
[INFO] hetu-state-store ................................... SUCCESS [  8.397 s]
[INFO] presto-main ........................................ SUCCESS [01:45 min]
[INFO] presto-resource-group-managers ..................... SUCCESS [  9.006 s]
[INFO] presto-tests ....................................... SUCCESS [  7.606 s]
[INFO] presto-atop ........................................ SUCCESS [  4.112 s]
[INFO] presto-jmx ......................................... SUCCESS [  2.395 s]
[INFO] presto-record-decoder .............................. SUCCESS [  5.134 s]
[INFO] presto-kafka ....................................... SUCCESS [ 23.618 s]
[INFO] presto-memory ...................................... SUCCESS [  2.551 s]
[INFO] presto-orc ......................................... SUCCESS [01:31 min]
[INFO] presto-benchmark ................................... SUCCESS [  3.630 s]
[INFO] presto-parquet ..................................... SUCCESS [  4.428 s]
[INFO] presto-rcfile ...................................... SUCCESS [  3.261 s]
[INFO] presto-hive ........................................ SUCCESS [ 27.049 s]
[INFO] presto-hive-hadoop2 ................................ SUCCESS [  9.097 s]
[INFO] presto-teradata-functions .......................... SUCCESS [  1.787 s]
[INFO] presto-example-http ................................ SUCCESS [  2.050 s]
[INFO] presto-local-file .................................. SUCCESS [  2.168 s]
[INFO] presto-tpcds ....................................... SUCCESS [  4.071 s]
[INFO] presto-base-jdbc ................................... SUCCESS [  3.424 s]
[INFO] presto-mysql ....................................... SUCCESS [01:08 min]
[INFO] presto-postgresql .................................. SUCCESS [ 45.495 s]
[INFO] presto-sqlserver ................................... SUCCESS [  4.601 s]
[INFO] presto-ml .......................................... SUCCESS [  4.080 s]
[INFO] presto-geospatial .................................. SUCCESS [  5.790 s]
[INFO] hetu-jdbc .......................................... SUCCESS [ 26.563 s]
[INFO] hetu-cli ........................................... SUCCESS [ 11.853 s]
[INFO] presto-product-tests ............................... SUCCESS [01:03 min]
[INFO] presto-benchmark-driver ............................ SUCCESS [  4.697 s]
[INFO] presto-verifier .................................... SUCCESS [  4.410 s]
[INFO] presto-testing-server-launcher ..................... SUCCESS [ 14.372 s]
[INFO] presto-password-authenticators ..................... SUCCESS [  1.025 s]
[INFO] presto-session-property-managers ................... SUCCESS [  1.954 s]
[INFO] presto-benchto-benchmarks .......................... SUCCESS [ 12.452 s]
[INFO] presto-thrift-api .................................. SUCCESS [  3.852 s]
[INFO] presto-thrift-testing-server ....................... SUCCESS [ 14.957 s]
[INFO] presto-thrift ...................................... SUCCESS [ 13.404 s]
[INFO] presto-proxy ....................................... SUCCESS [  3.337 s]
[INFO] presto-elasticsearch ............................... SUCCESS [ 35.763 s]
[INFO] hetu-hive-functions ................................ SUCCESS [ 11.109 s]
[INFO] hetu-oracle ........................................ SUCCESS [  8.868 s]
[INFO] hetu-metastore ..................................... SUCCESS [  2.050 s]
[INFO] hetu-vdm ........................................... SUCCESS [  2.382 s]
[INFO] hetu-heuristic-index ............................... SUCCESS [01:43 min]
[INFO] hetu-datacenter .................................... SUCCESS [  3.333 s]
[INFO] hetu-hana .......................................... SUCCESS [  2.917 s]
[INFO] hetu-listener ...................................... SUCCESS [  1.919 s]
[INFO] hetu-hbase ......................................... SUCCESS [ 41.148 s]
[INFO] hetu-carbondata .................................... SUCCESS [02:49 min]
[INFO] hetu-sql-migration-tool ............................ SUCCESS [  6.977 s]
[INFO] hetu-server ........................................ SUCCESS [01:30 min]
[INFO] hetu-server-rpm .................................... SUCCESS [01:14 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 24:15 min
[INFO] Finished at: 2020-12-09T12:02:51+08:00
[INFO] Final Memory: 554M/2917M
[INFO] ------------------------------------------------------------------------
  3. Unpack the openLooKeng package, then configure the catalog in etc/catalog/hbase197.properties. The full parameter reference is in the official docs and is not repeated here:
[root@ocdp198 catalog]# cat hbase197.properties
connector.name=hbase-connector
hbase.zookeeper.quorum=10.1.235.197,10.1.235.198,10.1.235.199
hbase.zookeeper.property.clientPort=2181
hbase.zookeeper.znode.parent=/hbase-unsecure
hbase.metastore.type=hetuMetastore
  4. Use the openLooKeng metastore to store the HBase metadata via etc/hetu-metastore.properties (again, see the official docs for the full parameter reference). Here MySQL backs the metastore; according to the docs, HDFS can also serve as the metadata store, though I have not tried it.
[root@ocdp198 etc]# cat hetu-metastore.properties
hetu.metastore.type=jdbc
hetu.metastore.db.url=jdbc:mysql://dp200:3306/hetum101236198
hetu.metastore.db.user=***
hetu.metastore.db.password=***
  5. Start openLooKeng and log in via the CLI; the hbase catalog has been loaded:
lk> show catalogs;
 Catalog
----------
 hbase197
 hive
 system
(3 rows)
  6. The HBase connector supports two forms of table creation: (1) creating a table that links directly to a table that already exists in the HBase data source, and (2) creating a new table that does not yet exist in HBase.

Creating a new table from openLooKeng

lk> CREATE TABLE hbase197.default.member3 (
 ->     rowId       VARCHAR,
 ->     age1    INTEGER
 -> );
CREATE TABLE
lk> insert into hbase197.default.member3 values('11111',234567);
INSERT: 1 row

Query 20201209_060113_00009_wevd7, FINISHED, 1 node
Splits: 35 total, 35 done (100.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]

lk> select * from hbase197.default.member3;
 rowid |  age1
-------+--------
 11111 | 234567
(1 row)

Query 20201209_060119_00010_wevd7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:00 [1 rows, 11B] [4 rows/s, 48B/s]

Check the new table from the HBase shell:

[ocdp@dp197 root]$ hbase shell
hbase(main):001:0>
hbase(main):002:0* list
TABLE
member
member1
member2
member3
4 row(s)
Took 0.5402 seconds
=> ["member", "member1", "member2", "member3"]
hbase(main):003:0> scan 'member3'
ROW                                                  COLUMN+CELL
 11111                                               column=family:age1, timestamp=1607493336908, value=234567
1 row(s)
Took 0.2353 seconds
hbase(main):004:0>

Linking openLooKeng to an existing HBase table
First, create the table in the HBase shell:

hbase(main):009:0> create 'member7','info'
Created table member7
Took 1.3789 seconds
=> Hbase::Table - member7
hbase(main):010:0> put 'member7','12345678','info:age','27'
Took 0.0859 seconds
hbase(main):011:0> scan 'member7'
ROW                                                  COLUMN+CELL
 12345678                                            column=info:age, timestamp=1607493983862, value=27
1 row(s)
Took 0.0103 seconds
hbase(main):012:0>

Then map the existing table from openLooKeng:

lk> CREATE TABLE hbase197.default.member7 (
 ->     rowId       VARCHAR,
 ->     age1    INTEGER
 -> )
 -> WITH (
 ->     column_mapping = 'age1:info:age',
 ->     row_id = 'rowId',
 ->     hbase_table_name = 'default:member7'
 -> );
CREATE TABLE
lk> select * from hbase197.default.member7;
  rowid   | age1
----------+------
 12345678 |   27
(1 row)

Query 20201209_061349_00025_wevd7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:00 [1 rows, 10B] [4 rows/s, 44B/s]

The mapping format is 'column_name:family:qualifier'; see the official docs for details. A fuller example covering the supported types:

CREATE TABLE schemaName.tableName (
    rowId		VARCHAR,
    qualifier1	TINYINT,
    qualifier2	SMALLINT,
    qualifier3	INTEGER,
    qualifier4	BIGINT,
    qualifier5	DOUBLE,
    qualifier6	BOOLEAN,
    qualifier7	TIME,
    qualifier8	DATE,
    qualifier9	TIMESTAMP
)
WITH (
    column_mapping = 'qualifier1:f1:q1, qualifier2:f1:q2, 
    qualifier3:f2:q3, qualifier4:f2:q4, qualifier5:f2:q5, qualifier6:f3:q1, 
    qualifier7:f3:q2, qualifier8:f3:q3, qualifier9:f3:q4',
    row_id = 'rowId',
    hbase_table_name = 'hbaseNamespace:hbaseTable',
    external = false
);
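Building the column_mapping string by hand invites typos. A tiny helper sketch (plain Python, purely illustrative and not part of openLooKeng) shows how the 'column:family:qualifier' pieces fit together:

```python
# Illustrative only: assemble an HBase connector column_mapping string
# from a {column_name: (family, qualifier)} dict.
def build_column_mapping(columns):
    return ', '.join(f'{col}:{fam}:{qual}'
                     for col, (fam, qual) in columns.items())

mapping = build_column_mapping({
    'age1': ('info', 'age'),
    'name1': ('info', 'name'),
})
print(mapping)  # age1:info:age, name1:info:name
```

The resulting string can be pasted straight into the `column_mapping` table property shown above.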
  7. Finally, a Cartesian product between HBase and Hive:
lk> select * from hbase197.default.member7, hive.prestotest.score ;
  rowid   | age1 | sno |  cno  | degree
----------+------+-----+-------+--------
 12345678 |   27 | 105 | 3-105 |     88
 12345678 |   27 | 101 | 6-166 |     85
 12345678 |   27 | 103 | 3-105 |     92
 12345678 |   27 | 105 | 3-245 |     75
 12345678 |   27 | 107 | 6-166 |     79
 12345678 |   27 | 109 | 3-245 |     68
 12345678 |   27 | 109 | 3-105 |     76
 12345678 |   27 | 107 | 3-105 |     91
 12345678 |   27 | 108 | 3-105 |     78
 12345678 |   27 | 103 | 3-245 |     86
 12345678 |   27 | 101 | 3-105 |     64
 12345678 |   27 | 108 | 6-166 |     81
(12 rows)

Query 20201209_061624_00032_wevd7, FINISHED, 1 node
Splits: 46 total, 46 done (100.00%)
0:07 [25 rows, 14.7KB] [3 rows/s, 2.17KB/s]
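As a sanity check on the row count: a Cartesian product pairs every row of one side with every row of the other, so one member7 row against twelve score rows yields twelve results. The same semantics in plain Python, with a few made-up sample rows mirroring the tables above:

```python
from itertools import product

# Hypothetical sample rows, shaped like the two tables above.
hbase_rows = [('12345678', 27)]                 # 1 row from member7
hive_rows = [('105', '3-105', 88),
             ('101', '6-166', 85),
             ('103', '3-105', 92)]              # 3 of the 12 score rows

combined = [h + s for h, s in product(hbase_rows, hive_rows)]
print(len(combined))  # 1 * 3 = 3 combined rows
```

Each combined row concatenates the columns of both sides, exactly as in the query output above.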