hive中map相关函数总结

news/2024/10/17 15:29:10/

目录

  • hive官方函数解释
  • 示例
  • 实战

hive官方函数解释

hive官网函数大全地址: hive官网函数大全地址

Return TypeNameDescription
mapmap(key1, value1, key2, value2, …)Creates a map with the given key/value pairs.
arraymap_values(Map<K.V>)Returns an unordered array containing the values of the input map.
arraymap_keys(Map<K.V>)Returns an unordered array containing the keys of the input map.
map<string,string>str_to_map(text[, delimiter1, delimiter2])Splits text into key-value pairs using two delimiters. Delimiter1 separates text into K-V pairs, and Delimiter2 splits each K-V pair. Default delimiters are ‘,’ for delimiter1 and ‘:’ for delimiter2.
Tkey,Tvalueexplode(MAP<Tkey,Tvalue> m)Explodes a map to multiple rows. Returns a row-set with a two columns (key,value) , one row for each key-value pair from the input map. (As of Hive 0.8.0.).

示例

1、map(key1, value1, key2, value2, …)

SELECT map('name', '张三', 'age', 20, 'gender', '男') AS student;
---结果:
student	
{"age":"20","gender":"男","name":"张三"}

2、map_values(Map<K.V>)

SELECT map_keys(map('name', '张三', 'age', 20, 'gender', '男')) AS keys;
---结果:
keys
["name","age","gender"]

3、map_values(Map<K.V>)

SELECT map_values(map('name', '张三', 'age', 20, 'gender', '男')) AS values;
---结果:
values	
["张三","20","男"]

4、str_to_map(str, delimiter1, delimiter2)
str_to_map 函数用于将一个字符串转换为 Map 对象。具体来说,str_to_map 函数会将一个由键值对组成的字符串解析成一个 Map 对象,其中键和值之间使用指定的分隔符进行分隔。其中,str 是要转换的字符串,delimiter1 是键值对之间的分隔符,delimiter2 是键和值之间的分隔符。默认情况下,delimiter1 的值是 ‘,’,delimiter2 的值是 ‘:’。

SELECT str_to_map('name:张三,age:20,gender:男', ',', ':') AS student;
---结果:
student	
{"age":"20","gender":"男","name":"张三"}SELECT str_to_map('name=张三,age=20,gender=男', ',', '=') AS student;
---结果:
student	
{"age":"20","gender":"男","name":"张三"}

5、explode (map)

select explode(map('A',10,'B',20,'C',30));
select explode(map('A',10,'B',20,'C',30)) as (key,value);
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf;
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf as key,value;
---上述四个结果均为:
key     value
A       10	
B       20	
C       30

实战

给出一组学生数据,有名字,课程,等级,分数等字段,现在求每门课的情况,包含平均成绩,及这门课包含哪些学生及学生的等级

with stud as
( select  'zhang3' as name ,'优' as grade  ,'math' as course ,'88' as score  union all select  'li4' as name ,'良' as grade  ,'math' as course ,'72' as scoreunion all select  'zhao6' as name ,'差' as grade  ,'math' as course ,'44' as scoreunion all select  'wang5' as name ,'优' as grade  ,'chinese' as course ,'80' as scoreunion all select  'zhao6' as name ,'优' as grade  ,'chinese' as course ,'55' as scoreunion all select  'tian7' as name ,'优' as grade  ,'chinese' as course ,'75' as score
)--sql1
select course, collect_set(concat(name,':',grade)) as collect , avg(score) from stud group by course;
---结果:
course             collect                                             avg(score)	
math        ["li4:良","zhao6:差","zhang3:优"]                           68.0
chinese     ["wang5:优","tian7:优","zhao6:优"]                          70.0
----sql2
select course, concat_ws(',',collect_set(concat(name,':',grade))) as strings , avg(score) from stud group by course;
---结果:
course                      strings                                        avg(score)
math             li4:良,zhao6:差,zhang3:优                                  68.0
chinese          wang5:优,tian7:优,zhao6:优                                 70.0
----sql3
select course, str_to_map(concat_ws(',',collect_set(concat(name,':',grade))),',',':') as maps , avg(score) from stud group by course;
---结果:
course                               maps                              avg(score)	
math                 {"li4":"良","zhang3":"优","zhao6":"差"}             68.0
chinese              {"tian7":"优","wang5":"优","zhao6":"优"}            70.0

注意:
第一种sql,collect 字段的类型是array;第二种sql,strings字段的类型是string;第三种sql,maps字段的类型是map;
问题来了,能否在第二种的基础上,实现第一种和第三种的结果,且字段类型是string;
下面实现第二种转化为第三种,实际上就是map格式转换成json字符串;

with stud as
( select  'zhang3' as name ,'优' as grade  ,'math' as course ,'88' as score  union all select  'li4' as name ,'良' as grade  ,'math' as course ,'72' as scoreunion all select  'zhao6' as name ,'差' as grade  ,'math' as course ,'44' as scoreunion all select  'wang5' as name ,'优' as grade  ,'chinese' as course ,'80' as scoreunion all select  'zhao6' as name ,'优' as grade  ,'chinese' as course ,'55' as scoreunion all select  'tian7' as name ,'优' as grade  ,'chinese' as course ,'75' as score
)select 
course
,concat('{"',string2,'"}') as string3
from  
(select 
course
,regexp_replace(string1,'\\,','\\"\\,\\"') as string2
from  
(
select course,concat_ws(',', collect_list(concat_ws('":"', k,v) ) ) as string1
from (
select course, str_to_map(concat_ws(',',collect_set(concat(name,':',grade))),',',':') as maps , avg(score) 
from stud group by course
)test_map_1
lateral view outer explode(maps) kv as k,v
group by course
) tt 
) tm ---结果:
course                               string3                            	
math                 {"li4":"良","zhang3":"优","zhao6":"差"}           
chinese              {"tian7":"优","wang5":"优","zhao6":"优"}        

http://www.ppmy.cn/news/1279167.html

相关文章

华为OD机试真题- 小华最多能得到多少克黄金-2023年OD统一考试(C卷)

题目描述: 小华按照地图去寻宝,地图上被划分成 m 行和 n 列的方格,横纵坐标范围分别是 [0, n−1]和[0, m-1]。在横坐标和纵坐标的数位之和不大于k的方格中存在黄金(每个方格中仅存在一克黄金),但横坐标和纵坐标之和大于k的方格存在危险不可进入。小华从入口(0,0)进入,任…

企业出海-如何保护客户账户安全?

近年来国内企业竞争日益激烈&#xff0c;许多企业在这般环境下难以持续发展。那么该如何获得业务的可持续性增长&#xff0c;如何获取更多的客户的同时开阔公司的视野&#xff1f;出海便是如今帮助国内企业能快速发展壮大的潮流之一&#xff0c;摆脱了局限于国内发展的束缚奔向…

医学实验室检验科LIS信息系统源码

实验室信息管理是专为医院检验科设计的一套实验室信息管理系统&#xff0c;能将实验仪器与计算机组成网络&#xff0c;使病人样品登录、实验数据存取、报告审核、打印分发&#xff0c;实验数据统计分析等繁杂的操作过程实现了智能化、自动化和规范化管理。 实验室管理系统功能介…

ubuntu22.04 将python源切换为清华源

在 Ubuntu 22.04 中将 Python 包管理器 pip 的源切换到清华大学镜像源可以加快包的下载速度&#xff0c;特别是在中国大陆地区。以下是详细的步骤&#xff1a; 方法 1: 临时使用清华源 在使用 pip 安装包时&#xff0c;您可以通过 --index-url 参数临时指定清华源。例如&…

华为鸿蒙开发适合哪些人学习?

随着鸿蒙系统的崛起&#xff0c;越来越多的人开始关注鸿蒙开发&#xff0c;并希望成为鸿蒙开发者。然而&#xff0c;鸿蒙开发并不适合所有人&#xff0c;那么哪些人最适合学习鸿蒙开发呢&#xff1f;本文将为您总结鸿蒙开发适合的人群。 一、具备编程基础的人 学习鸿蒙开发需要…

产品原型设计软件 Axure RP 9 mac支持多人写作设计

axure rp 9 mac是一款产品原型设计软件&#xff0c;它可以让你在上面任意构建草图、框线图、流程图以及产品模型&#xff0c;还能够注释一些重要地方&#xff0c;axure rp汉化版可支持同时多人写作设计和版本管理控制&#xff0c;这款交互式原型设计工具可以帮助设计者制作出高…

NiNNet

目录 一、网络介绍 1、全连接层存在的问题 2、NiN的解决方案(NiN块) 3、NiN架构 4、总结 二、代码实现 1、定义NiN卷积块 2、NiN模型 3、训练模型 一、网络介绍 NiN&#xff08;Network in Network&#xff09;是一种用于图像识别任务的卷积神经网络模型。它由谷歌研究…

明明白白Linux之基础教程

Linux基础教程 1、Linux的介绍 Linux是一个自由和开放源代码的操作系统内核&#xff0c;广泛应用于各种计算设备和系统中。它起源于Finland的Linus Torvalds在1991年开始开发&#xff0c;并迅速发展成为一个强大和可靠的操作系统。 以下是特点和重要组成部分&#xff1a; 开…