目录
- hive官方函数解释
- 示例
- 实战
hive官方函数解释
hive官网函数大全地址: hive官网函数大全地址
Return Type | Name | Description |
---|---|---|
array | array(value1, value2, …) | Creates an array with the given elements. |
boolean | array_contains(Array, value) | Returns TRUE if the array contains value. |
array | sort_array(Array) | Sorts the input array in ascending order according to the natural ordering of the array elements and returns it (as of version 0.9.0). |
T | explode(ARRAY a) | Explodes an array to multiple rows. Returns a row-set with a single column (col), one row for each element from the array. |
int,T | posexplode(ARRAY a) | Explodes an array to multiple rows with additional positional column of int type (position of items in the original array, starting with 0). Returns a row-set with two columns (pos,val), one row for each element from the array. |
示例
1、array(value1, value2, …)
创建一个 Array 对象,由一组值组成。
SELECT array(85, 90, 78, 92, 86) AS scores;
---结果
scores
[85,90,78,92,86]
2、array_contains(Array, value)
判断一个值是否在 Array 对象中存在。
SELECT array_contains(array(85, 90, 78, 92, 86), 60) AS is_passed;
---结果
is_passed
falseSELECT if(array_contains(array(85, 90, 78, 92, 86), 60),1,0) AS is_passed;
---结果
is_passed
0
3、sort_array(Array)
sort_array 函数可以用于对 Array 对象中的元素进行排序。具体来说,sort_array 函数会将 Array 对象中的元素按照升序顺序进行排序,并返回一个新的排序后的 Array 对象。
SELECT sort_array(array(85, 90, 78, 92, 86)) AS sorted_scores;
---结果
sorted_scores
[78,85,86,90,92]
4、explode (array)
explode 函数可以用于将一个 Array 对象拆分成多行。具体来说,explode 函数会将 Array 对象中的每个元素拆分成一行,并与原始数据集中的其他字段一起返回。
select explode(array('A','B','C'));
select explode(array('A','B','C')) as col;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf as col;---上述四个结果均为
col
A
B
C
5、posexplode (array)
select posexplode(array('A','B','C'));
select posexplode(array('A','B','C')) as (pos,val);
select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf as pos,val;
---上述四个结果均为
pos col
0 A
1 B
2 C
实战
下面全年级的学生成绩单的部分,包含学生姓名,课程,分数,年级排名,要求将每人的成绩进行汇总整理:按排名,课程和成绩的顺序
数据:
'小明' ,'语文' ,80 as score , 100 '小明' ,'数学' ,90 as score, 85 '小明' ,'英语' ,75 as score, 203 '小花' ,'语文' ,85 as score , 90 '小花' ,'数学' ,65 as score , 350 '小花' ,'英语' ,90 as score , 20
代码实现:
with students as(select '小明' as name,'语文' as course ,80 as score , 100 as rkunion all select '小明' as name,'数学' as course ,90 as score, 85 as rkunion all select '小明' as name,'英语' as course ,75 as score, 203 as rkunion all select '小花' as name,'语文' as course ,85 as score , 90 as rkunion all select '小花' as name,'数学' as course ,65 as score , 350 as rkunion all select '小花' as name,'英语' as course ,90 as score , 20 as rk
)
SELECT name, concat_ws(',', sort_array(collect_list(concat_ws(':', lpad(cast(rk as string), 4, '0'), course, cast(score as string) )))) AS sorted_scores
FROM students
GROUP BY name;
---结果
name sorted_scores
小明 0085:数学:90,0100:语文:80,0203:英语:75
小花 0020:英语:90,0090:语文:85,0350:数学:65