How Much Does It Cost to Count MySQL Table Rows?

When counting all the rows in a table, which index will be used? Well, the MySQL documentation gives a straightforward answer to this, quoting:

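InnoDB processes SELECT COUNT(*) statements by traversing the smallest available secondary index unless an index or optimizer hint directs the optimizer to use a different index. If a secondary index is not present, InnoDB processes SELECT COUNT(*) statements by scanning the clustered index.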
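
To illustrate the "index or optimizer hint" part of the quote, here is a minimal sketch. It assumes a sysbench-style table named sbtest1 with a primary key and a secondary index, like the one used in the examples in this post. The hint asks the optimizer to count through the clustered index rather than the smallest secondary index; the resulting plan and timing are not shown here because they depend on the server.

```sql
-- Assumption: sbtest1 has a PRIMARY key plus a secondary index (k_1 in sysbench).
-- Without a hint, the optimizer is expected to pick the smallest secondary index.
EXPLAIN SELECT COUNT(*) FROM sbtest1;

-- With an index hint, the count is directed to the clustered index instead.
EXPLAIN SELECT COUNT(*) FROM sbtest1 FORCE INDEX (PRIMARY);

-- The same hint applied to the actual query.
SELECT COUNT(*) FROM sbtest1 FORCE INDEX (PRIMARY);
```

Comparing the `key` column of the two EXPLAIN outputs is a quick way to confirm that the hint was honored.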

Using a simple sysbench table as an example, let's first check the query plan:

mysql > explain select count(*) from sbtest1;
+----+-------------+---------+------------+-------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+---------+------------+-------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | sbtest1 | NULL       | index | NULL          | k_1  | 4       | NULL | 986400 |   100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

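As expected, the query optimizer chose the secondary index and is going to scan it. Running the query increments the handler counters accordingly, and one would expect the slow log to report the same: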

mysql > flush status;
Query OK, 0 rows affected (0.00 sec)

mysql > select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.17 sec)

mysql > show status like 'Handler_read%';
+-----------------------+---------+
| Variable_name         | Value   |
+-----------------------+---------+
| Handler_read_first    | 1       |
| Handler_read_key      | 1       |
| Handler_read_last     | 0       |
| Handler_read_next     | 1000000 |
| Handler_read_prev     | 0       |
| Handler_read_rnd      | 0       |
| Handler_read_rnd_next | 0       |
+-----------------------+---------+
7 rows in set (0.01 sec)

The first surprise is that the slow log does not report any Rows_examined after the above execution:

# Time: 2025-02-07T10:14:28.548037Z
# User@Host: msandbox[msandbox] @ localhost []  Id:     9
# Schema: db1  Last_errno: 0  Killed: 0
# Query_time: 0.173279  Lock_time: 0.000006  Rows_sent: 1  Rows_examined: 0  Rows_affected: 0  Bytes_sent: 62
use db1;
SET timestamp=1736696128;
select count(*) from sbtest1;

What if no secondary key is available? The second example table has only the primary key defined:

mysql > desc sbtest2;
+-------+-----------+------+-----+---------+----------------+
| Field | Type      | Null | Key | Default | Extra          |
+-------+-----------+------+-----+---------+----------------+
| id    | int       | NO   | PRI | NULL    | auto_increment |
| k     | int       | NO   |     | 0       |                |
| c     | char(120) | NO   |     |         |                |
| pad   | char(60)  | NO   |     |         |                |
+-------+-----------+------+-----+---------+----------------+
4 rows in set (0.01 sec)

mysql > explain select count(*) from sbtest2;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | sbtest2 | NULL       | index | NULL          | PRIMARY | 4       | NULL | 986400 |   100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

Here comes the second surprise: the handler counters do not change at all!

mysql > flush status;
Query OK, 0 rows affected (0.00 sec)

mysql > select count(*) from sbtest2;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.04 sec)

mysql > show status like 'Handler_read%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 0     |
| Handler_read_last     | 0     |
| Handler_read_next     | 0     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
7 rows in set (0.00 sec)

Will the cost of the same query be any different on another table, this time with no index defined at all?

mysql > desc sbtest3;
+-------+-----------+------+-----+---------+-------+
| Field | Type      | Null | Key | Default | Extra |
+-------+-----------+------+-----+---------+-------+
| id    | int       | NO   |     | 0       |       |
| k     | int       | NO   |     | 0       |       |
| c     | char(120) | NO   |     |         |       |
| pad   | char(60)  | NO   |     |         |       |
+-------+-----------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql > explain select count(*) from sbtest3;
+----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------+
|  1 | SIMPLE      | sbtest3 | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 985734 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql > select count(*) from sbtest3;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.05 sec)

mysql > show status like 'Handler_read%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 0     |
| Handler_read_last     | 0     |
| Handler_read_next     | 0     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
7 rows in set (0.00 sec)

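It is hard to tell other than by observing the execution time (which is comparable to that of the table with a primary key), because the status handlers are not incremented and the slow log provides no Rows_examined information either. Moreover, the global InnoDB counters (such as Innodb_rows_read) and the Performance Schema table statistics also do not change when the clustered index (explicit or implicit) is used for counting!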

mysql > select id from sbtest1 limit 1; select id from sbtest2 limit 1; select id from sbtest3 limit 1;
+--------+
| id     |
+--------+
| 731065 |
+--------+
1 row in set (0.00 sec)

+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.01 sec)

+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

mysql > select table_schema,table_name,rows_fetched from sys.schema_table_statistics where table_schema="db1";
+--------------+------------+--------------+
| table_schema | table_name | rows_fetched |
+--------------+------------+--------------+
| db1          | sbtest1    |            1 |
| db1          | sbtest2    |            1 |
| db1          | sbtest3    |            1 |
+--------------+------------+--------------+
3 rows in set (0.01 sec)

mysql > select count(*) from sbtest1; select count(*) from sbtest2; select count(*) from sbtest3;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.21 sec)

+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.03 sec)

+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.03 sec)

mysql > select table_schema,table_name,rows_fetched from sys.schema_table_statistics where table_schema="db1";
+--------------+------------+--------------+
| table_schema | table_name | rows_fetched |
+--------------+------------+--------------+
| db1          | sbtest1    |      1000001 |
| db1          | sbtest2    |            1 |
| db1          | sbtest3    |            1 |
+--------------+------------+--------------+
3 rows in set (0.01 sec)
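
The same before-and-after comparison can be sketched for the global InnoDB counter and for the per-index Performance Schema statistics mentioned above. This is only an outline, assuming the db1 schema and sbtest2 table from this post and an otherwise idle server (any concurrent traffic would also move the global counter). No output is shown, since the values depend on the server.

```sql
-- Global InnoDB counter before the count.
SHOW GLOBAL STATUS LIKE 'Innodb_rows_read';

-- Count through the clustered index (sbtest2 has only a primary key).
SELECT COUNT(*) FROM sbtest2;

-- Global InnoDB counter after the count; compare with the first value.
SHOW GLOBAL STATUS LIKE 'Innodb_rows_read';

-- Per-index fetch counters from Performance Schema for the same schema.
SELECT OBJECT_NAME, INDEX_NAME, COUNT_FETCH
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE OBJECT_SCHEMA = 'db1';
```

Running the same sequence against sbtest1 gives a baseline in which the counters are expected to move, which makes the difference easier to spot.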

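When no secondary key is involved, the reads are clearly ignored in the runs above! So how can we check which index was actually read during the count query and how much data had to be fetched? I can think of only one way: on a freshly restarted, idle server, we can inspect the Buffer Pool contents (provided that innodb_buffer_pool_load_at_startup = OFF).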

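Run this check only while the server is idle, because querying INFORMATION_SCHEMA.INNODB_BUFFER_PAGE can itself affect performance!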

mysql > select TABLE_NAME,INDEX_NAME,count(*) from INFORMATION_SCHEMA.INNODB_BUFFER_PAGE where TABLE_NAME like "`db1`.%" GROUP BY TABLE_NAME,INDEX_NAME;
Empty set (0.19 sec)

mysql > select count(*) from sbtest1; select count(*) from sbtest2; select count(*) from sbtest3;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.22 sec)

+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.20 sec)

+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.87 sec)

mysql > select TABLE_NAME,INDEX_NAME,count(*) from INFORMATION_SCHEMA.INNODB_BUFFER_PAGE where TABLE_NAME like "`db1`.%" GROUP BY TABLE_NAME,INDEX_NAME;
+-----------------+-----------------+----------+
| TABLE_NAME      | INDEX_NAME      | count(*) |
+-----------------+-----------------+----------+
| `db1`.`sbtest1` | k_1             |      833 |
| `db1`.`sbtest2` | PRIMARY         |    13713 |
| `db1`.`sbtest3` | GEN_CLUST_INDEX |    14302 |
+-----------------+-----------------+----------+
3 rows in set (0.26 sec)

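Based on the above results, we can see how many index pages were loaded to perform each count. The small secondary index wins here because it does not contain the row data. In this case, however, that did not translate directly into execution time; I will write a separate post on the exact reason once I have figured it out.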
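
As a rough cross-check that does not require a restart, the index sizes can also be read from InnoDB's persistent statistics. This is a sketch under assumptions: persistent statistics are enabled (the default), the schema is db1 as in this post, and the figures are estimates of each index's total size in pages, not the number of pages a particular query actually read.

```sql
-- Estimated size of every index in db1, in pages and in MiB.
-- 'size' is the statistic holding the number of pages in the index;
-- @@innodb_page_size converts pages to bytes (16384 by default).
SELECT table_name,
       index_name,
       stat_value AS pages,
       ROUND(stat_value * @@innodb_page_size / 1024 / 1024, 1) AS approx_mib
FROM mysql.innodb_index_stats
WHERE database_name = 'db1'
  AND stat_name = 'size';
```

If the statistics look stale, running ANALYZE TABLE on the table refreshes them before the comparison.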
