如何用SQL语句来查询表或索引的行存/列存存储方式|OceanBase 用户问题集锦

server/2025/1/12 16:14:21/

一、问题背景

自OceanBase 4.3.0版本起,支持了列存引擎,允许表和索引以行存、纯列存或行列冗余的形式创建,且这些存储方式可以自由组合。除了使用 show create table命令来查看表和索引的存储类型外,也有用户询问如何通过SQL语句来查询表或索引的存储方式。那么,具体该如何操作呢?

1735278087

二、测试表

说明:这里仅列举了部分组合,还有其他的组合应该也是类似的,不再赘述,欢迎测试,拍砖。

-- 行存表,行存索引
create table  t1(c1 int,c2 int,c3 int,c4 int,c5 int,primary key(c1),key idx_t1_c2(c2)) partition by hash(c1) partitions 3;
create table  t2(c1 int,c2 int,c3 int,c4 int,c5 int,primary key(c1),key idx_t2_c2(c2)) partition by hash(c1) partitions 3 with column group(all columns);
create table  t3(c1 int,c2 int,c3 int,c4 int,c5 int,primary key(c1),key idx_t3_c2(c2) with column group(all columns)) partition by hash(c1) partitions 3 with column group(all columns);-- 行存表,纯列存索引
create table  t4(c1 int,c2 int,c3 int,c4 int,c5 int,primary key(c1),key idx_t4_c2(c2) with column group(each column)) partition by hash(c1) partitions 3;-- 行存表,行列混合索引
create table  t5(c1 int,c2 int,c3 int,c4 int,c5 int,primary key(c1),key idx_t5_c2(c2) with column group(each column,all columns)) partition by hash(c1) partitions 3;-- 纯列存表,行存索引
create table  t6(c1 int,c2 int,c3 int,c4 int,c5 int,primary key(c1),key idx_t6_c2(c2) with column group(all columns)) partition by hash(c1) partitions 3 with column group(each column);
create table  t7(c1 int,c2 int,c3 int,c4 int,c5 int,primary key(c1),key idx_t7_c2(c2)) partition by hash(c1) partitions 3 with column group(each column);-- 行列混合表,行列混合索引
create table  t8(c1 int,c2 int,c3 int,c4 int,c5 int,primary key(c1),key idx_t8_c2(c2) with column group(each column,all columns)) partition by hash(c1) partitions 3 with column group(each column,all columns);

三、摸索

从列存相关的语法上可以看出,引入列存后新增加了 with column group (xxx) 的关键字,可以尝试搜一下哪些表的列上涉及了 column_group 相关的字段,从下面的结果看目前并没有标准表或者视图提供这样的信息。

MySQL [oceanbase]> select distinct table_name from __all_virtual_table 
where table_id in (select distinct table_id from __all_virtual_column where column_name like '%column_group%');
+--------------------------------------------+
| table_name                                 |
+--------------------------------------------+
| __all_table_history                        |
| __all_column_group                         |
| __all_column_group_history                 |
| __all_column_group_mapping                 |
| __all_column_group_mapping_history         |
| __all_virtual_core_all_table               |
| __all_virtual_table                        |
| __all_virtual_table_history                |
| __all_virtual_column_group                 |
| __all_virtual_column_group_mapping         |
| __all_virtual_column_group_history         |
| __all_virtual_column_group_mapping_history |
+--------------------------------------------+
12 rows in set (0.97 sec)

从 __all_virtual_column_group 表的 column_group_type 字段,凭感觉可以标识。

MySQL [oceanbase]> desc __all_virtual_column_group;
+-------------------+--------------+------+-----+---------+-------+
| Field             | Type         | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| tenant_id         | bigint(20)   | NO   | PRI | NULL    |       |
| table_id          | bigint(20)   | NO   | PRI | NULL    |       |
| column_group_id   | bigint(20)   | NO   | PRI | NULL    |       |
| gmt_create        | timestamp(6) | NO   |     | NULL    |       |
| gmt_modified      | timestamp(6) | NO   |     | NULL    |       |
| column_group_name | varchar(389) | NO   |     |         |       |
| column_group_type | bigint(20)   | NO   |     | NULL    |       |
| block_size        | bigint(20)   | NO   |     | NULL    |       |
| compressor_type   | bigint(20)   | NO   |     | NULL    |       |
| row_store_type    | bigint(20)   | NO   |     | NULL    |       |
+-------------------+--------------+------+-----+---------+-------+
10 rows in set (0.02 sec)

从代码 src/share/schema/ob_schema_struct.h 找到 column_group 类型的枚举值。

enum ObColumnGroupType : uint8_t
{DEFAULT_COLUMN_GROUP = 0,ALL_COLUMN_GROUP,ROWKEY_COLUMN_GROUP,SINGLE_COLUMN_GROUP,NORMAL_COLUMN_GROUP,MAX_COLUMN_GROUP
};

四、结论

经过测试发现如何标识 表或者索引是行存/纯列存/行列冗余的方式存储,这里之所以给 "结论"使用引号扩起来,原因:

1、受限于自己测试的 case 可能不完善,存在错误的情况,欢迎一起测试,交流。

2、随着版本的迭代,针对枚举值可能会有调整/比如增删等,应以实际版本中的枚举值为准。

  • 某个租户下当同一个 table_id 的 column_group_type 包含3 但是不包含1,输出:纯列存表
  • 某个租户下当同一个 table_id 的 column_group_type 包含1 和 3,输出:行列混合表
  • 其他情况均输出 :纯行存表

sql和结果" style="background-color:transparent;">五、查询sql和结果

说明:可以按需调整 tenant_name/database_name

sql_1(混合查询)">sql_1(混合查询)

select t1.tenant_name,t2.database_name,case when t2.table_type = 'user table' then t2.table_namewhen t2.table_type = 'index' then t2.index_nameend as table_name,t2.table_id,t2.data_table_id,t2.table_type,casewhen t2.table_type = 'user table' thencasewhen sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) = 0 then '纯列存表'when sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) > 0 then '行列混合表'else '纯行存表'endwhen t2.table_type = 'index' thencasewhen sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) = 0 then '纯列存索引'when sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) > 0 then '行列混合索引'else '纯行存索引'endend as storage_type,coalesce(t3.table_name, null) as main_table_name
from __all_virtual_column_group cg
join dba_ob_tenants t1 on cg.tenant_id = t1.tenant_id
join cdb_ob_table_locations t2 on cg.tenant_id = t2.tenant_id and cg.table_id = t2.table_id
left join cdb_ob_table_locations t3 on t2.data_table_id = t3.table_id
where t1.tenant_name in ('test1','test7')and t2.database_name in ('row_column_db','db100','db600')and t2.table_type in ('user table', 'index')
group by t1.tenant_name, t2.database_name, t2.table_name, t2.table_id, t2.data_table_id, t2.table_type, main_table_name
order by t1.tenant_name,t2.database_name, t2.table_name;

sql1_查询结果">sql1_查询结果
+-------------+---------------+------------+----------+---------------+------------+--------------------+-----------------+
| tenant_name | database_name | table_name | table_id | data_table_id | table_type | storage_type       | main_table_name |
+-------------+---------------+------------+----------+---------------+------------+--------------------+-----------------+
| test1       | row_column_db | t1         |   596454 |          NULL | USER TABLE | 纯行存表           | NULL            |
| test1       | row_column_db | t2         |   596462 |          NULL | USER TABLE | 纯行存表           | NULL            |
| test1       | row_column_db | t3         |   596470 |          NULL | USER TABLE | 纯行存表           | NULL            |
| test1       | row_column_db | t4         |   596478 |          NULL | USER TABLE | 纯行存表           | NULL            |
| test1       | row_column_db | idx_t1_c2  |   596455 |        596454 | INDEX      | 纯行存索引         | t1              |
| test1       | row_column_db | idx_t2_c2  |   596463 |        596462 | INDEX      | 纯行存索引         | t2              |
| test1       | row_column_db | idx_t3_c2  |   596471 |        596470 | INDEX      | 纯行存索引         | t3              |
| test1       | row_column_db | idx_t4_c2  |   596479 |        596478 | INDEX      | 纯列存索引         | t4              |
| test7       | db100         | t8         |   500070 |          NULL | USER TABLE | 行列混合表         | NULL            |
| test7       | db100         | idx_t8_c2  |   500071 |        500070 | INDEX      | 行列混合索引       | t8              |
| test7       | db600         | t5         |   500045 |          NULL | USER TABLE | 纯行存表           | NULL            |
| test7       | db600         | t6         |   500053 |          NULL | USER TABLE | 纯列存表           | NULL            |
| test7       | db600         | t7         |   500061 |          NULL | USER TABLE | 纯列存表           | NULL            |
| test7       | db600         | idx_t5_c2  |   500046 |        500045 | INDEX      | 行列混合索引       | t5              |
| test7       | db600         | idx_t6_c2  |   500054 |        500053 | INDEX      | 纯行存索引         | t6              |
| test7       | db600         | idx_t7_c2  |   500062 |        500061 | INDEX      | 纯行存索引         | t7              |
+-------------+---------------+------------+----------+---------------+------------+--------------------+-----------------+
16 rows in set (3.45 sec)

sql2_查询纯列存的表/索引">sql2_查询纯列存的表/索引

select t1.tenant_name,t2.database_name,case when t2.table_type = 'user table' then t2.table_namewhen t2.table_type = 'index' then t2.index_nameend as table_name,t2.table_id,t2.data_table_id,t2.table_type,casewhen t2.table_type = 'user table' thencasewhen sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) = 0 then '纯列存表'when sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) > 0 then '行列混合表'else '纯行存表'endwhen t2.table_type = 'index' thencasewhen sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) = 0 then '纯列存索引'when sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) > 0 then '行列混合索引'else '纯行存索引'endend as storage_type,coalesce(t3.table_name, null) as main_table_name
from __all_virtual_column_group cg
join dba_ob_tenants t1 on cg.tenant_id = t1.tenant_id
join cdb_ob_table_locations t2 on cg.tenant_id = t2.tenant_id and cg.table_id = t2.table_id
left join cdb_ob_table_locations t3 on t2.data_table_id = t3.table_id
where t1.tenant_name in ('test1','test7')and t2.database_name in ('row_column_db','db100','db600')and t2.table_type in ('user table', 'index')
group by t1.tenant_name, t2.database_name, t2.table_name, t2.table_id, t2.data_table_id, t2.table_type, main_table_name
having(t2.table_type = 'user table' and sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) = 0)or(t2.table_type = 'index' and sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) = 0)
order by t1.tenant_name, t2.database_name, t2.table_name;

sql2_查询结果">sql2_查询结果
+-------------+---------------+------------+----------+---------------+------------+-----------------+-----------------+
| tenant_name | database_name | table_name | table_id | data_table_id | table_type | storage_type    | main_table_name |
+-------------+---------------+------------+----------+---------------+------------+-----------------+-----------------+
| test1       | row_column_db | idx_t4_c2  |   596479 |        596478 | INDEX      | 纯列存索引      | t4              |
| test7       | db600         | t6         |   500053 |          NULL | USER TABLE | 纯列存表        | NULL            |
| test7       | db600         | t7         |   500061 |          NULL | USER TABLE | 纯列存表        | NULL            |
+-------------+---------------+------------+----------+---------------+------------+-----------------+-----------------+
3 rows in set (3.48 sec)

sql3_查询行列冗余的表/索引">sql3_查询行列冗余的表/索引

SELECT tenant_name,database_name,table_name,table_id,data_table_id,table_type,storage_type,main_table_name
FROM (select t1.tenant_name,t2.database_name,case when t2.table_type = 'user table' then t2.table_namewhen t2.table_type = 'index' then t2.index_nameend as table_name,t2.table_id,t2.data_table_id,t2.table_type,casewhen t2.table_type = 'user table' thencasewhen sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) = 0 then '纯列存表'when sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) > 0 then '行列混合表'else '纯行存表'endwhen t2.table_type = 'index' thencasewhen sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) = 0 then '纯列存索引'when sum(case when cg.column_group_type = 3 then 1 else 0 end) > 0 and sum(case when cg.column_group_type = 1 then 1 else 0 end) > 0 then '行列混合索引'else '纯行存索引'endend as storage_type,coalesce(t3.table_name, null) as main_table_namefrom __all_virtual_column_group cgjoin dba_ob_tenants t1 on cg.tenant_id = t1.tenant_idjoin cdb_ob_table_locations t2 on cg.tenant_id = t2.tenant_id and cg.table_id = t2.table_idleft join cdb_ob_table_locations t3 on t2.data_table_id = t3.table_idwhere t1.tenant_name in ('test1','test7')and t2.database_name in ('row_column_db','db100','db600')and t2.table_type in ('user table', 'index')group by t1.tenant_name, t2.database_name, t2.table_name, t2.table_id, t2.data_table_id, t2.table_type, main_table_name
) subquery
WHERE storage_type IN ('行列混合表', '行列混合索引')
ORDER BY tenant_name, database_name, table_name;

sql3_查询结果" style="background-color:transparent;">sql3_查询结果
+-------------+---------------+------------+----------+---------------+------------+--------------------+-----------------+
| tenant_name | database_name | table_name | table_id | data_table_id | table_type | storage_type       | main_table_name |
+-------------+---------------+------------+----------+---------------+------------+--------------------+-----------------+
| test7       | db100         | idx_t8_c2  |   500071 |        500070 | INDEX      | 行列混合索引       | t8              |
| test7       | db100         | t8         |   500070 |          NULL | USER TABLE | 行列混合表         | NULL            |
| test7       | db600         | idx_t5_c2  |   500046 |        500045 | INDEX      | 行列混合索引       | t5              |
+-------------+---------------+------------+----------+---------------+------------+--------------------+-----------------+
3 rows in set (3.33 sec)

http://www.ppmy.cn/server/157251.html

相关文章

w148基于spring boot的文档管理系统的设计与实现

🙊作者简介:多年一线开发工作经验,原创团队,分享技术代码帮助学生学习,独立完成自己的网站项目。 代码可以查看文章末尾⬇️联系方式获取,记得注明来意哦~🌹赠送计算机毕业设计600个选题excel文…

PyCharm 引用其他路径下的文件报错 ModuleNotFound 或报红

PyCharm 中引用其他路径下的文件提示 ModuleNotFound,将被引用目录添加到系统路径: # # 获取当前目录 dir_path os.path.dirname(os.path.realpath(__file__)) # # 获取上级目录 parent_dir_path os.path.abspath(os.path.join(dir_path, os.pardir))…

C语言基础:野指针、空指针、空悬指针

野指针、空指针、空悬指针 野指针 定义:只想一块未知区域(以及销毁或者访问受限的内存区域外的已存在或不存在的内存区域)的指针,被称作野指针。野指针是危险的。 危害: ① 引用野指针,相当于访问了非法…

el-table 多级表头

1.结构 <el-table:data"tableData"border:height"700"style"width: 100% !important; overflow: auto":header-cell-style"{ background: #becee1, color: #333 }":cell-style"{ padding: 5px }"><template v-for…

数据结构:二叉搜索树详解

二叉搜索树详解 一、二叉搜索树的概念二、 二叉搜索树的性能分析&#xff08;一&#xff09;二叉搜索树的效率&#xff08;二&#xff09;数组结构和二叉排序树的比较 三、二叉排序树的插入&#xff08;一&#xff09;操作规则&#xff08;二&#xff09;代码实现 四、二叉排序…

rust如何在类中保存一个闭包,它可以捕获类自己?

首先这个为什么在 Rust 中这么难实现&#xff1f; 我们得承认一个残酷的现实&#xff1a;Rust 不是 C。在 C 中&#xff0c;可以随心所欲地使用引用&#xff0c;编译器不会过多干涉。但在 Rust 中&#xff0c;编译器会像严厉的监管者一样盯着我们的每一个动作&#xff0c;生怕…

软件测试的未来:如何跨越自动化到自主测试的鸿沟

近年来&#xff0c;随着人工智能&#xff08;AI&#xff09;的飞速发展&#xff0c;软件测试领域迎来了技术变革。从传统的手工测试到自动化测试&#xff0c;再到如今炙手可热的自主测试&#xff0c;测试技术正以前所未有的速度演进。然而&#xff0c;从自动化测试到自主测试&a…

通过架构提升后端接口性能

引言 在现代软件开发中,后端接口的性能优化是确保系统高效运行的关键因素之一。 随着用户数量的增加和数据量的增长,未优化的后端接口会导致响应时间变长,用户体验下降,甚至引发系统崩溃。 本文将探讨一些常见且有效的后端接口性能优化方法,并通过具体的Java代码实例来展…