Oracle Tip: Understand how NULLs affect IN and EXISTS


On the surface, it may appear that the SQL clauses IN and EXISTS are interchangeable. However, they're quite different in how they handle NULL values and may give different results. The problem comes from the fact that, in an Oracle database, a NULL value means unknown, so any comparison or operation against a NULL value is also NULL, and any test that returns NULL is always ignored. For example, neither one of these queries return any rows:

select 'true' from dual where 1 = null;
select 'true' from dual where 1 != null;

The value 1 is neither equal nor not equal to NULL. Only IS NULL would return true on a NULL value and return a row.

select 'true' from dual where 1 is null;
select 'true' from dual where null is null;

When you use IN, you're telling SQL to take a value and compare it against every value or set of values in a list using =. If any NULL values exist, a row will not be returned--even if both values are NULL.

select 'true' from dual where null in (null);
select 'true' from dual where (null,null) in ((null,null));
select 'true' from dual where (1,null) in ((1,null));

An IN is functionally equivalent to the = ANY clause:

select 'true' from dual where null = ANY (null);
select 'true' from dual where (null,null) = ANY ((null,null));
select 'true' from dual where (1,null) = ANY ((1,null));

When you use an equivalent form of EXISTS, SQL counts rows and ignores the value(s) in the subquery--even if you return NULL.

select 'true' from dual where exists (select null from dual);
select 'true' from dual where exists (select 0 from dual where null is null);

The IN and EXISTS are logically the same. The IN clause compares values returned by the subquery and filters out rows in the outer query; the EXISTS clause compares values and filters out rows inside the subquery. In the case of NULL values, the resulting set of rows is the same.

select ename from emp where empno in (select mgr from emp);
select ename from emp e where exists (select 0 from emp where mgr = e.empno);

But problems arise when the logic is reversed to use NOT IN and NOT EXISTS, which return different sets of rows (the first query returns 0 rows; the second returns the intended data--they aren't the same query):

select ename from emp where empno not in (select mgr from emp);
select ename from emp e where not exists (select 0 from emp where mgr =
 e.empno);

The NOT IN clause is virtually equivalent to comparing each value with = and failing if any test is FALSE or NULL. For example:

select 'true' from dual where 1 not in (null,2);
select 'true' from dual where 1 != null and 1 != 2;
select 'true' from dual where (1,2) not in ((2,3),(2,null));
select 'true' from dual where (1,null) not in ((1,2),(2,3));

These queries don't return any rows. The second is more obvious, 1 != NULL is NULL, so the whole WHERE condition is false for that row. While these would work:

select 'true' from dual where 1 not in (2,3);
select 'true' from dual where 1 != 2 and 1 != 3;

You can still use the NOT IN query from before, as long as you prevent NULL from being returned in the results (again, these both work, but I'm assuming empno is not null, which is a good assumption in this case):

select ename from emp where empno not in (select mgr from emp where mgr is not
 null);
select ename from emp where empno not in (select nvl(mgr,0) from emp);

By understanding the difference between IN, EXISTS, NOT IN, and NOT EXISTS, you can avoid a very common problem when NULLs appear in the data of a subquery.

Scott Stephens worked for Oracle for more than 13 years in technical support, e-commerce, marketing, and software development.

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章