在Mysql中可以通过group_concat()
函数实现分组字符串拼接,在HIVE SQL中可以使用concat_ws()+collect_set()/collect_list()
函数实现相同的效果。
实例:
a | b | c |
---|---|---|
2014 | B | 9 |
2015 | A | 8 |
2014 | A | 10 |
2015 | B | 7 |
2014 | B | 6 |
1.concat_ws+collect_list 非去重拼接
select a ,concat_ws('-',collect_list(b)) as col_b,concat_ws('-',collect_list(cast(c as string))) as col_c
from tb_name
group by a
;
查询结果:
a | col_b | col_c |
---|---|---|
2014 | B-A-B | 9-10-6 |
2015 | A-B | 8-7 |
2.concat_ws+collect_set 去重拼接
select a ,concat_ws('-',collect_set(b)) as col_b,concat_ws('-',collect_set(cast(c as string))) as col_cfrom tb_name
group by a
;
查询结果:
a | col_b | col_c |
---|---|---|
2014 | B-A | 9-10-6 |
2015 | A-B | 8-7 |
3.如果在分组拼接的时候需要保持拼接结果为有序排序,可以通过以下两种方式
1)先排序再拼接
select a ,concat_ws('-',collect_set(b)) as col_b,concat_ws('-',collect_set(cast(c as string))) as col_cfrom
(
select a ,b,c,row_number() over (partition by a order by b asc) as rn
from tb_name
) t
group by a
;
查询结果:
a | col_b | col_c |
---|---|---|
2014 | A-B | 10-6-9 |
2015 | A-B | 8-7 |
select a ,concat_ws('-',collect_list(b)) as col_b,concat_ws('-',collect_list(cast(c as string))) as col_cfrom
(
select a ,b,c,row_number() over (partition by a order by b asc) as rn
from tb_name
) t
group by a
;
查询结果:(col_b与col_c 的位置对应且col_b中的字符有序)
a | col_b | col_c |
---|---|---|
2014 | A-B-B | 10-6-9 |
2015 | A-B | 8-7 |
2)sort_array()函数升序排列
select a ,concat_ws('-',sort_array(collect_list(b))) as col_bfrom tb_name
group by a
;
a | col_b |
---|---|
2014 | A-B-B |
2015 | A-B |