故障013：易忘的NULL表达式

- 一、问题引入
- 二、探索之路
- - 2.1 数据准备
  - 2.2 回顾NULL表达式
  - 2.3 重现问题
  - - 2.3.1 分析原因
    - 2.3.2 如何化解预期？
- 三、知识总结

一、问题引入

某单位开发人员理直气壮抛出一张截图，以红色醒目地标记问题，好似挑刺。咦！！！数据库不对头哟，等于或不等于查出结果一样，由此断定数据库可能有问题，过于决绝。这种陷阱极易掉进，因为等于和不等于不应该都相同的结果。如果没人深扒其原因，估计被人唬住。然而其本质原因是忽略表中的数据特征，一味地自发想象预期美好，不切实际分析问题。
在这里插入图片描述

二、探索之路

2.1 数据准备

sql">drop table if exists test01;
drop table if exists test02;
create table test01 (c1 int, c2 int , c3 int);
create table test02 (d1 int, d2 int , d3 int);insert into test01 values(1, 10, 11);
insert into test01 values(2, null, 12);
insert into test01 values(3, null, 13);
insert into test01 values(4, null, 14);insert into test02 values(1, null, 11);
insert into test02 values(2, 20, 12);
insert into test02 values(3, 30, 13);
insert into test02 values(4, 40, 14);
commit;

2.2 回顾NULL表达式

解释：NULL在数据库是一个特殊的存在，不是具体数据（暂时无法确定具体值，可能后期会赋予具体意义的值），属于一个抽象的占位符，表示某类型的未知数据，可以理解为数学方程式中的X（未知数，不确定性值）。

当然针对NULL的判断，数据库有专门的运算符比较，即IS NULL 或IS NOT NULL，而不是 = 、<>、 != 常规运算符，则遇见非IS NULL/IS NOT NULL的普通比较表达式，结果永远是false。请记住这个理论。

sql">-- 举例1：判断某表某列非空有哪些记录
-- 正确写法
select * from test01 where c2 is not null;
select * from test01 where not c2 is null;
select * from test01 where nvl(c2, '<NA!>') != '<NA!>' ;
select * from test01 where coalesce(c2, '<NA!>') != '<NA!>' ;
select * from test01 where isnull(c2, '<NA!>') != '<NA!>' ;
select * from test01 where ifnull(c2, '<NA!>') != '<NA!>' ;
select * from test01 where decode(c2, null, 0, 1) = 1;
select * from test01 where lnnvl(c2 is null);
select * from test01 where c2 > 0;  -- 知道该表C2字段数值分布，等效的写法-- 错误写法（返回空结果集）
select * from test01 where c2 != null;
select * from test01 where c2 <> null;-- 普通比较运算永远false（返回空结果集）
select * from test01 where c2 = null;
select * from test01 where c2 > null;
select * from test01 where c2 < null;

在这里插入图片描述

2.3 重现问题

模拟问题引入环节所说的场景，两表关联

sql">select count(*) -- 0
from test01 t1 
inner join test02 t2 on  t1.c1 = t2.d1
where t1.c2 = t2.d2; select count(*) -- 0
from test01 t1 
inner join test02 t2 on  t1.c1 = t2.d1
where t1.c2 != t2.d2;

在这里插入图片描述

2.3.1 分析原因

sql">select t1.*, t2.* 
from test01 t1 
inner join test02 t2 on  t1.c1 = t2.d1;

在c1=d1等值关联条件成立情况下，进一步二级关联条件C2 = D2 或 C2 <> D2 ，从下图可知每次两表对二级关联条件匹配时存在一个NULL值，无论=、!= 比较运算，二级关联表达式永远false，导致最终结果集为空，正如count统计为零。

在这里插入图片描述

2.3.2 如何化解预期？

按他的预想是包含二级关联条件时存在NULL的情况，所以因在where过滤条件加入IS NULL判断。

sql">select t1.*, t2.* 
from test01 t1 
inner join test02 t2 on  t1.c1 = t2.D1
where t1.c2 = t2.d2 or t1.c2 is null or t2.d2 is null; select count(*) -- 4
from test01 t1 
inner join test02 t2 on  t1.c1 = t2.d1
where t1.c2 = t2.d2 or t1.c2 is null or t2.d2 is null;

在这里插入图片描述

三、知识总结

1）编写SQL未按预期执行，应先思考SQL逻辑是否得当？再观察数据本身特征，不能盲目自认为。
2）SQL当中NULL往往易遗忘，正由它的特殊性存在，一旦未考虑全面，有可能结果集是非预期（可能遗漏有效数据或增多无效数据）。
3）建议数据表设定default默认值属性，减少NULL比较的可能性，减轻脑仁的负荷运转。