当前位置: 首页 > 工具软件 > udf-sample > 使用案例 >

Hive使用,UDF的创建

呼延凌
2023-12-01

create database testdb;


自定义UDF
1.把程序打包放到目标机器上:hiveserver(hive使用端)
2.进入hive客户端,添加jar包
hive>add jar /usr/jar/hivetest.jar
3.创建临时函数
hive>create temporary function add_example as 'hive.udf.Add';
4.查询HQL语句:
select add_example(8,9) from score;
5.销毁临时函数
hive>drop temporary function add_example;


create table psn1(
id int,
name string,
likes ARRAY <string>,
address MAP <string,string>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED by ','
COLLECTION ITEMS TERMINATED by '-'
MAP KEYS TERMINATED by ':';

vi /var/tmp/data/psn1
1,xiaoming,book-tv-code,beijing:chaoyang-shanghai:pudong
2,lilei,book-code,huoxing:xxx-shanghai:pudong
3,hanmeimie,book-lilei,beijing:xxx

hive>load data local inpath '/var/tmp/data/psn1' into table psn1;


创建外部表
create external table psn2(
id int,
name string,
likes ARRAY <string>,
address MAP <string,string>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED by ','
COLLECTION ITEMS TERMINATED by '-'
MAP KEYS TERMINATED by ':'
LOCATION '/user/data';

外部表删除后,在数据库里没有表的数据,但在目录中该文件仍存在


CREATE TABLE IF NOT EXISTS employee ( eid int, name String,
salary String, destination String)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

sample.txt
1201 Gopal 45000 Technical manager
1202 Manisha 45000 Proof reader
1203 Masthanvali 40000 Technical writer
1204 Kiran 40000 Hr Admin
1205 Kranthi 30000 Op Admin
hive>load data local inpath '/user/sample.txt' into table employee;

创建分区
create table psn3(
id int,
name string,
likes ARRAY <string>,
address MAP <string,string>
)
PARTITIONED BY(year int)
ROW FORMAT DELIMITED
FIELDS TERMINATED by ','
COLLECTION ITEMS TERMINATED by '-'
MAP KEYS TERMINATED by ':'
LOCATION '/user/psn3';


alter table psn3 add partiton (year int);
alter table psn3 drop partition (year int);

load data local inpath '/user/sample.txt' into table employee partition (dept='IT');

load data [local] 'filepath' into table tb_name;
如果filepath是hdfs的目录,相当于将hdfs的文件移动到hive表目录
如果是本地目录,是将文件上传到hdfs,再将这个hdfs文件拷贝到对应表文件目录下

create table result_table (cnt int);

常用语保存运算或统计结果
from psn1
insert into table result_table
select count(*);

转载于:https://www.cnblogs.com/pashanhu/p/10950262.html

 类似资料: