NULL and aggregates

An aggregate in SQL Server will ignore NULL values. Check out this simple demonstration to learn how aggregates work with NULL values.

Run the following script to create a sample database for the demonstration:

CREATE DATABASE Test
GO

USE Test
GO

CREATE TABLE dbo.AggTest
	(
		COL1 INT
	)
GO

INSERT dbo.AggTest VALUES (10), (20), (NULL)

A simple sum on the column COL1 will of course return the value 30 (10+20):

SELECT SUM(COL1) FROM dbo.AggTest
Demo Aggregates And NULL Sum
Demo Aggregates And NULL Sum

An aggregare such as sum will return the same value even if there are NULL values or not. But things become more interesting with aggregates that depends on the number of rows, such as averages. In our example the average of 10,20 and NULL is 15 since one row is ignored:

SELECT AVG(COL1) FROM dbo.AggTest
Demo Aggregates And NULL Average
Demo Aggregates And NULL Average

If you want to count the rows with NULL when the average is calculated, the ISNULL function can be used:

SELECT AVG(ISNULL(COL1,0)) FROM dbo.AggTest
Demo Aggregates And NULL Average ISNULL
Demo Aggregates And NULL Average ISNULL

As you can see the average of 10,20 and NULL is now 10. The ISNULL function replaces the NULL with 0 and therefore it is included in the average aggregate. As this little demonstrations shows, you need to decide how aggregates should behave, especially with aggregates that uses the number of rows as part of the calculation.

Note that the two following statements return diffrent results:

SELECT COUNT(*) FROM dbo.AggTest
SELECT COUNT(COL1) FROM dbo.AggTest
Demo Aggregates And NULL Count
Demo Aggregates And NULL Count

You might think that logically, COUNT(*) is the same as COUNT(COL1) since there are no other columns, and that they should return the same result. But this actually makes “technical” sense, since COUNT(*) counts the total number of rows in the table, regardless of NULL settings on columns. A count of values in COL1 returns 2 since this is the number of rows that has a value.

ANSI warnings

Assuming you have the default ANSI settings, you’ll also get a warning when you aggregate columns that has NULL values:

[red_box]Warning: Null value is eliminated by an aggregate or other SET operation.[/red_box]

This message can be avoided with the ANSI_WARNINGS setting. The default is ON but can be changed with the SET statement:

SET ANSI_WARNINGS OFF

Note however, that this also controls if transactions are automatically rolled back when an divide by zero is encountered. When ANSI_WARNINGS are OFF, all open transactions are rolled back. When ANSI_WARNINGS are on, transactions are not rolled back. But in both cases the same error will be raised:

[red_box]Msg 8134, Level 16, State 1, Line 3 Divide by zero error encountered.[/red_box]

Tomas Lind

Tomas Lind - Consulting services as SQL Server DBA and Database Developer at High Coast Database Solutions AB.

This Post Has 2 Comments

  1. Ken

    Quick Point: If you are using heterogeneous queries (i.e. queries to a linked server) then ANSI NULLS MUST be set to ON to ensures consistent query semantics. If you try to set ANSI NULLS to OFF you will receive the following error:

    “Heterogeneous queries require the ANSI_NULLS and ANSI_WARNINGS options to be set for the connection…”

    The same holds true for ANSI Warnings. You will message if you try to set ANSI WARNINGS to OFF. You can avoid the message by deriving a value for NULLS in either the linked record set or natively in SQL Server.

    1. tompabo

      Thanks for the input!

Leave a Reply