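At first glance, openLooKeng looks like a thin wrapper around Presto. I thought so too, especially after browsing the source tree. It is not: the deeper I dig, the more of its own work I find, so credit to openLooKeng. (I am still a beginner, mind you. Yesterday I was delighted to find a bug, fixed it myself, and then discovered it had already been fixed 12 days earlier. Sob.) openLooKeng adds quite a few improvements over Presto, such as the multi-data-center concept and an HBase connector. This post walks through a basic configuration of the openLooKeng HBase connector and then runs a simple Cartesian product against a Hive data source.
I assume you are already familiar with installing openLooKeng. If not, see my earlier post: Installing openLooKeng manually.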
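The latest release currently available from the openLooKeng website is 1.0.1. The HBase connector in that version has many bugs and is unusable, so you can either build 1.1.0 yourself or wait for the official 1.1.0 release. Here we build 1.1.0 from source. Get the code from the openLooKeng source repository. I would normally prefer GitHub, but it is far too slow from here, and as a Chinese developer I am glad to support a home-grown platform. If you cloned the repository, simply run
mvn install -DskipTests
and you are done. If you downloaded the source as an archive instead, the build fails with: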
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.497 s
[INFO] Finished at: 2020-12-09T13:40:27+08:00
[INFO] Final Memory: 111M/1222M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal pl.project13.maven:git-commit-id-plugin:3.0.1:revision (default) on project hetu-common: .git directory is not found! Please specify a valid [dotGitDirectory] in your pom.xml -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
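The downloaded archive has no .git directory, which git-commit-id-plugin requires. Work around this by adding the following configuration to the pom so that the plugin is skipped:
...
<plugins>
  <!-- newly added -->
  <plugin>
    <groupId>pl.project13.maven</groupId>
    <artifactId>git-commit-id-plugin</artifactId>
    <configuration>
      <skip>true</skip>
    </configuration>
  </plugin>
...
Then run the build again: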
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] presto-root ........................................ SUCCESS [ 5.333 s]
[INFO] hetu-common ........................................ SUCCESS [ 10.612 s]
[INFO] presto-spi ......................................... SUCCESS [ 54.071 s]
[INFO] presto-plugin-toolkit .............................. SUCCESS [ 3.930 s]
[INFO] hetu-transport ..................................... SUCCESS [ 1.860 s]
[INFO] presto-client ...................................... SUCCESS [ 4.068 s]
[INFO] presto-parser ...................................... SUCCESS [ 33.219 s]
[INFO] presto-geospatial-toolkit .......................... SUCCESS [ 4.980 s]
[INFO] presto-array ....................................... SUCCESS [ 1.825 s]
[INFO] presto-matching .................................... SUCCESS [ 0.662 s]
[INFO] presto-memory-context .............................. SUCCESS [ 0.468 s]
[INFO] presto-tpch ........................................ SUCCESS [ 6.006 s]
[INFO] hetu-hazelcast ..................................... SUCCESS [ 56.355 s]
[INFO] presto-testing-docker .............................. SUCCESS [ 28.761 s]
[INFO] hetu-filesystem-client ............................. SUCCESS [01:01 min]
[INFO] hetu-seed-store .................................... SUCCESS [ 0.933 s]
[INFO] hetu-state-store ................................... SUCCESS [ 8.397 s]
[INFO] presto-main ........................................ SUCCESS [01:45 min]
[INFO] presto-resource-group-managers ..................... SUCCESS [ 9.006 s]
[INFO] presto-tests ....................................... SUCCESS [ 7.606 s]
[INFO] presto-atop ........................................ SUCCESS [ 4.112 s]
[INFO] presto-jmx ......................................... SUCCESS [ 2.395 s]
[INFO] presto-record-decoder .............................. SUCCESS [ 5.134 s]
[INFO] presto-kafka ....................................... SUCCESS [ 23.618 s]
[INFO] presto-memory ...................................... SUCCESS [ 2.551 s]
[INFO] presto-orc ......................................... SUCCESS [01:31 min]
[INFO] presto-benchmark ................................... SUCCESS [ 3.630 s]
[INFO] presto-parquet ..................................... SUCCESS [ 4.428 s]
[INFO] presto-rcfile ...................................... SUCCESS [ 3.261 s]
[INFO] presto-hive ........................................ SUCCESS [ 27.049 s]
[INFO] presto-hive-hadoop2 ................................ SUCCESS [ 9.097 s]
[INFO] presto-teradata-functions .......................... SUCCESS [ 1.787 s]
[INFO] presto-example-http ................................ SUCCESS [ 2.050 s]
[INFO] presto-local-file .................................. SUCCESS [ 2.168 s]
[INFO] presto-tpcds ....................................... SUCCESS [ 4.071 s]
[INFO] presto-base-jdbc ................................... SUCCESS [ 3.424 s]
[INFO] presto-mysql ....................................... SUCCESS [01:08 min]
[INFO] presto-postgresql .................................. SUCCESS [ 45.495 s]
[INFO] presto-sqlserver ................................... SUCCESS [ 4.601 s]
[INFO] presto-ml .......................................... SUCCESS [ 4.080 s]
[INFO] presto-geospatial .................................. SUCCESS [ 5.790 s]
[INFO] hetu-jdbc .......................................... SUCCESS [ 26.563 s]
[INFO] hetu-cli ........................................... SUCCESS [ 11.853 s]
[INFO] presto-product-tests ............................... SUCCESS [01:03 min]
[INFO] presto-benchmark-driver ............................ SUCCESS [ 4.697 s]
[INFO] presto-verifier .................................... SUCCESS [ 4.410 s]
[INFO] presto-testing-server-launcher ..................... SUCCESS [ 14.372 s]
[INFO] presto-password-authenticators ..................... SUCCESS [ 1.025 s]
[INFO] presto-session-property-managers ................... SUCCESS [ 1.954 s]
[INFO] presto-benchto-benchmarks .......................... SUCCESS [ 12.452 s]
[INFO] presto-thrift-api .................................. SUCCESS [ 3.852 s]
[INFO] presto-thrift-testing-server ....................... SUCCESS [ 14.957 s]
[INFO] presto-thrift ...................................... SUCCESS [ 13.404 s]
[INFO] presto-proxy ....................................... SUCCESS [ 3.337 s]
[INFO] presto-elasticsearch ............................... SUCCESS [ 35.763 s]
[INFO] hetu-hive-functions ................................ SUCCESS [ 11.109 s]
[INFO] hetu-oracle ........................................ SUCCESS [ 8.868 s]
[INFO] hetu-metastore ..................................... SUCCESS [ 2.050 s]
[INFO] hetu-vdm ........................................... SUCCESS [ 2.382 s]
[INFO] hetu-heuristic-index ............................... SUCCESS [01:43 min]
[INFO] hetu-datacenter .................................... SUCCESS [ 3.333 s]
[INFO] hetu-hana .......................................... SUCCESS [ 2.917 s]
[INFO] hetu-listener ...................................... SUCCESS [ 1.919 s]
[INFO] hetu-hbase ......................................... SUCCESS [ 41.148 s]
[INFO] hetu-carbondata .................................... SUCCESS [02:49 min]
[INFO] hetu-sql-migration-tool ............................ SUCCESS [ 6.977 s]
[INFO] hetu-server ........................................ SUCCESS [01:30 min]
[INFO] hetu-server-rpm .................................... SUCCESS [01:14 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 24:15 min
[INFO] Finished at: 2020-12-09T12:02:51+08:00
[INFO] Final Memory: 554M/2917M
[INFO] ------------------------------------------------------------------------
With the build done, add an HBase catalog file and a hetu-metastore configuration, then check in the CLI that the new catalog is listed:
[root@ocdp198 catalog]# cat hbase197.properties
connector.name=hbase-connector
hbase.zookeeper.quorum=10.1.235.197,10.1.235.198,10.1.235.199
hbase.zookeeper.property.clientPort=2181
hbase.zookeeper.znode.parent=/hbase-unsecure
hbase.metastore.type=hetuMetastore
[root@ocdp198 etc]# cat hetu-metastore.properties
hetu.metastore.type=jdbc
hetu.metastore.db.url=jdbc:mysql://dp200:3306/hetum101236198
hetu.metastore.db.user=***
hetu.metastore.db.password=***
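A typo in a catalog file is easy to miss, so a quick check before restarting the server can help. The following is a minimal sketch in Python, not part of openLooKeng: the helper names (parse_properties, missing_keys) are hypothetical, and the list of "required" keys is simply the set of keys used in the hbase197.properties file above, not an authoritative list from the connector.

```python
# Hypothetical helper, not part of openLooKeng: parse a catalog
# .properties file and report which expected keys are absent.

# Assumption: these are just the keys used in this post's catalog file.
REQUIRED_KEYS = [
    "connector.name",
    "hbase.zookeeper.quorum",
    "hbase.zookeeper.property.clientPort",
    "hbase.zookeeper.znode.parent",
    "hbase.metastore.type",
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        props[key.strip()] = value.strip()
    return props


def missing_keys(props, required=REQUIRED_KEYS):
    """Return the expected keys that the properties do not define."""
    return [key for key in required if key not in props]


SAMPLE = """\
connector.name=hbase-connector
hbase.zookeeper.quorum=10.1.235.197,10.1.235.198,10.1.235.199
hbase.zookeeper.property.clientPort=2181
hbase.zookeeper.znode.parent=/hbase-unsecure
hbase.metastore.type=hetuMetastore
"""

props = parse_properties(SAMPLE)
print(missing_keys(props))                         # []
print(props["hbase.zookeeper.quorum"].split(","))  # the three ZooKeeper hosts
```

In practice you would read the text from the real file, for example with open("hbase197.properties").read(), instead of the inline sample.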
lk> show catalogs;
Catalog
----------
hbase197
hive
system
(3 rows)
Creating a new table from openLooKeng
lk> CREATE TABLE hbase197.default.member3 (
-> rowId VARCHAR,
-> age1 INTEGER
-> );
CREATE TABLE
lk> insert into hbase197.default.member3 values('11111',234567);
INSERT: 1 row
Query 20201209_060113_00009_wevd7, FINISHED, 1 node
Splits: 35 total, 35 done (100.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]
lk> select * from hbase197.default.member3;
rowid | age1
-------+--------
11111 | 234567
(1 row)
Query 20201209_060119_00010_wevd7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:00 [1 rows, 11B] [4 rows/s, 48B/s]
Check the new table in HBase:
[ocdp@dp197 root]$ hbase shell
hbase(main):001:0>
hbase(main):002:0* list
TABLE
member
member1
member2
member3
4 row(s)
Took 0.5402 seconds
=> ["member", "member1", "member2", "member3"]
hbase(main):003:0> scan 'member3'
ROW COLUMN+CELL
11111 column=family:age1, timestamp=1607493336908, value=234567
1 row(s)
Took 0.2353 seconds
hbase(main):004:0>
Mapping an existing HBase table into openLooKeng
Create the table in the HBase shell
hbase(main):009:0> create 'member7','info'
Created table member7
Took 1.3789 seconds
=> Hbase::Table - member7
hbase(main):010:0> put 'member7','12345678','info:age','27'
Took 0.0859 seconds
hbase(main):011:0> scan 'member7'
ROW COLUMN+CELL
12345678 column=info:age, timestamp=1607493983862, value=27
1 row(s)
Took 0.0103 seconds
hbase(main):012:0>
Map the existing table in openLooKeng
lk> CREATE TABLE hbase197.default.member7 (
-> rowId VARCHAR,
-> age1 INTEGER
-> )
-> WITH (
-> column_mapping = 'age1:info:age',
-> row_id = 'rowId',
-> hbase_table_name = 'default:member7'
-> );
CREATE TABLE
lk> select * from hbase197.default.member7;
rowid | age1
----------+------
12345678 | 27
(1 row)
Query 20201209_061349_00025_wevd7, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:00 [1 rows, 10B] [4 rows/s, 44B/s]
Each mapping entry has the form 'column_name:family:qualifier'. See the official documentation for details. The general form is:
CREATE TABLE schemaName.tableName (
rowId VARCHAR,
qualifier1 TINYINT,
qualifier2 SMALLINT,
qualifier3 INTEGER,
qualifier4 BIGINT,
qualifier5 DOUBLE,
qualifier6 BOOLEAN,
qualifier7 TIME,
qualifier8 DATE,
qualifier9 TIMESTAMP
)
WITH (
column_mapping = 'qualifier1:f1:q1, qualifier2:f1:q2,
qualifier3:f2:q3, qualifier4:f2:q4, qualifier5:f2:q5, qualifier6:f3:q1,
qualifier7:f3:q2, qualifier8:f3:q3, qualifier9:f3:q4',
row_id = 'rowId',
hbase_table_name = 'hbaseNamespace:hbaseTable',
external = false
);
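To make the 'column_name:family:qualifier' format concrete, here is a small Python sketch that splits a column_mapping string into its parts. It only illustrates the format as described above. It is not the connector's own parser, and the names ColumnMapping and parse_column_mapping are hypothetical.

```python
from typing import List, NamedTuple


class ColumnMapping(NamedTuple):
    column: str     # column name on the openLooKeng side
    family: str     # HBase column family
    qualifier: str  # HBase column qualifier


def parse_column_mapping(mapping: str) -> List[ColumnMapping]:
    """Split 'col:family:qualifier, ...' into structured entries."""
    result = []
    for entry in mapping.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma or blank piece
        parts = [part.strip() for part in entry.split(":")]
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected column:family:qualifier, got {entry!r}")
        result.append(ColumnMapping(*parts))
    return result


# The mapping used for member7 above.
print(parse_column_mapping("age1:info:age"))
# [ColumnMapping(column='age1', family='info', qualifier='age')]

# Multi-line mappings, as in the general form, work the same way.
multi = parse_column_mapping("""qualifier1:f1:q1, qualifier2:f1:q2,
qualifier3:f2:q3""")
print([m.family for m in multi])  # ['f1', 'f1', 'f2']
```

Read this way, the member7 mapping says that the openLooKeng column age1 is stored in HBase under family info, qualifier age, which matches the info:age cell seen in the scan output.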
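Finally, the Cartesian product of the HBase table with a Hive table. Listing two tables in FROM without a join condition pairs every row of one with every row of the other:
lk> select * from hbase197.default.member7, hive.prestotest.score ;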
rowid | age1 | sno | cno | degree
----------+------+-----+-------+--------
12345678 | 27 | 105 | 3-105 | 88
12345678 | 27 | 101 | 6-166 | 85
12345678 | 27 | 103 | 3-105 | 92
12345678 | 27 | 105 | 3-245 | 75
12345678 | 27 | 107 | 6-166 | 79
12345678 | 27 | 109 | 3-245 | 68
12345678 | 27 | 109 | 3-105 | 76
12345678 | 27 | 107 | 3-105 | 91
12345678 | 27 | 108 | 3-105 | 78
12345678 | 27 | 103 | 3-245 | 86
12345678 | 27 | 101 | 3-105 | 64
12345678 | 27 | 108 | 6-166 | 81
(12 rows)
Query 20201209_061624_00032_wevd7, FINISHED, 1 node
Splits: 46 total, 46 done (100.00%)
0:07 [25 rows, 14.7KB] [3 rows/s, 2.17KB/s]
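The row count follows directly from the definition of a Cartesian product: member7 holds one row and the result has 12 rows, one per row of score. The Python sketch below reproduces that pairing with itertools.product. The tuples are copied from the output above; the Python types of sno and degree (int) are an assumption, since the Hive schema is not shown in this post.

```python
from itertools import product

# The single row of hbase197.default.member7: (rowid, age1)
hbase_rows = [("12345678", 27)]

# Rows of hive.prestotest.score as they appear in the result: (sno, cno, degree)
# Assumption: sno and degree are treated as integers here.
hive_rows = [
    (105, "3-105", 88), (101, "6-166", 85), (103, "3-105", 92),
    (105, "3-245", 75), (107, "6-166", 79), (109, "3-245", 68),
    (109, "3-105", 76), (107, "3-105", 91), (108, "3-105", 78),
    (103, "3-245", 86), (101, "3-105", 64), (108, "6-166", 81),
]

# Every left row is paired with every right row: 1 x 12 = 12 rows.
joined = [left + right for left, right in product(hbase_rows, hive_rows)]

print(len(joined))  # 12
print(joined[0])    # ('12345678', 27, 105, '3-105', 88)
```

With N rows on the HBase side the result would grow to N x 12 rows, which is why an unconditioned join like this is only sensible on small tables.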