Temporal tables was introduced in SQL Server 2016, and is a technique that simplifies keeping track of changed records in a table. The handling of changes of records in a data warehouse dimension table is a good example of when temporal tables can be useful. In this blog post, I’ll cover the basics of temporal tables.
Slowly changing dimensions
The concept of slowly changing dimensions covers different strategies for handling changes in data warehouse dimensions.
As an example to illustrate common problems with dimensions that change over time, say we have a company that is selling cars and trucks. We are selling a car (Volvo XC90) that belong to the product group Private Cars. This product group has been very profitable during the latest years. For instance, the profit for 2015 was 10M.
Now we have decided that the current set of product groups are too coarse, so we will create additional product groups but we also keep the old ones. One of the new product groups is SUV. We decide that the Volvo XC90 belongs to that product group.
Now the question arises how to handle this type of change in our data warehouse dimension for products. The concept of slowly changing dimensions was invented by Ralph Kimball, and there are seven strategies (types) to deal with changes in dimensions.
The most basic techniques are covered in types 0 to 4:
Type 0: retain original
Means we do not change the product at all. The Volvo XC90 still belongs to the product group Private Cars. Perhaps we instead add a new product dimension that reflects the new structure?
Type 1: overwrite
With this strategy, we overwrite the old values. The Volvo XC90 now belongs to the product group SUV. The problem with this method is that it also changes history. Suddenly, the profit for the product group Private Cars 2015 drops to 7 Mil. Changes in historical facts may be confusing to the users of the data warehouse. Why does the data change, long after the fact occured?
Type 2: add new row
Here we add a new row to the dimension, reflecting the fact that the Volvo XC90 now belongs to the product group SUV. But we also keep the old row. Somehow we therefore need to differentiate between rows so that we’ll get only one matching row when joining from fact tables. A common method is to add columns showing the period for when the dimension record was valid:
NULL is used to illustrate an open end or beginning. E.g. the current row for each product has a NULL value in the ValidTo column.
With this method, the profit for previous years is unaffected since those fact table rows points to CarId 1. New sales (2016-08-01 and later) points to CarId 2.
Type 3: add new attribute
This method preserves the previous value along with the current value. This method can be useful when the complete history of changes is unnecessary. In our example we would have one column named CurrentProductGroup and one column named PreviousProductGroup. And perhaps a column showing the date time when the change occurred.
Type 4: add history table
Before a record is changed or deleted, a copy is written to a separate history table. This way, the dimension table always contains the current row (like the Type 1 method). But history is preserved in a separate table, for instance with a _history suffix. The joining from fact tables is straight forward since there is only one matching record.
In earlier versions of SQL Server (before SQL Server 2016), an implementation of a type 4 dimension table often involved triggers that handled the writing to the history table. Also, querying historical data required special syntax to get the desired matching record from history.
Temporal tables closely resemble the type 4 method.
Temporal tables basics
With temporal tables you no longer need to create custom solutions (e.g. triggers) to keep the historical records. Instead you define the table as a system-versioned temporal table. The table will contain the current row, and a history table will contain previous values when records are changed or deleted. Picture from MSDN:
Creating temporal tables
There are a couple of requirements on temporal table candidates. First, they need to have a primary key (!). Second, there must be two additional columns for row start time for row end time. Those columns show the period that the row was valid.
Let’s create the table in the example above as a temporal table and insert one record. Note the additional statements (highlighted) to include temporal table features:
CREATE DATABASE TemporalTablesTest GO USE TemporalTablesTest GO CREATE TABLE dbo.Cars ( CarId INT NOT NULL CONSTRAINT CarId_pk PRIMARY KEY CLUSTERED ,CarName VARCHAR(100) NOT NULL ,ProductGroup VARCHAR(100) NOT NULL ,ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START ,ValidTo DATETIME2 GENERATED ALWAYS AS ROW END ,PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) ) WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Cars_history)) GO INSERT Cars (CarId,CarName,ProductGroup) VALUES (1, 'Volvo XC90','Private Cars') GO
As you can see the ValidFrom and ValidTo columns are automatically populated when we do the insert:
Right now, the table dbo.Cars_history is empty. This is because we haven’t done any deletes or updates yet.
In object explorer, temporal tables are differentiated with a new icon and the added text “System-Versioned”:
Making an existing table a temporal table
If you want to turn an existing table into a temporal table, a similar syntax is used. Let’s say the cars table already exists, but without the ValidFrom and ValidTo columns (first we drop the old Cars table):
ALTER TABLE dbo.Cars SET (SYSTEM_VERSIONING = OFF) DROP TABLE dbo.Cars DROP TABLE dbo.Cars_history CREATE TABLE dbo.Cars ( CarId INT NOT NULL CONSTRAINT CarId_pk PRIMARY KEY CLUSTERED ,CarName VARCHAR(100) NOT NULL ,ProductGroup VARCHAR(100) NOT NULL ) GO INSERT Cars (CarId,CarName,ProductGroup) VALUES (1, 'Volvo XC90','Private Cars') GO
To add temporal table features to this table, use the following script:
ALTER TABLE dbo.Cars ADD ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START CONSTRAINT DF_ValidFrom DEFAULT '1900-01-01 00:00:00.0000000' ,ValidTo DATETIME2 GENERATED ALWAYS AS ROW END CONSTRAINT DF_ValidTo DEFAULT '9999-12-31 23:59:59.9999999' ,PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) GO ALTER TABLE dbo.Cars SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Cars_history)) GO
The script adds a period and a history table, so from now on changes and deletes will be stored in the history table. Note that default values are generated so if you run SQL Server Standard Edition and if you have many rows, this may be an expensive operation.
Inserting and updating temporal tables
Inserts works as usual. A new record will be inserted into the normal table but not into the history table. It is only updates and deletes that create records in the history table. Let’s again change the product group to SUV:
UPDATE Cars SET ProductGroup = 'SUV' WHERE CarId = 1
Now we can check the contents of both the normal and temporal table:
SELECT * FROM dbo.Cars SELECT * FROM dbo.Cars_history
As you can see the first table shows the current value for product group. The ValidFrom value shows the time when the update was performed. The ValidTo column shows 9999-12-31 23:59:59.9999999, the value that is used to indicate an open end (the current record).
The second table is the history table and shows the previous value for the product group. ValidFrom is the value when the row was created/updated/deleted, or (as in this case) the default value (1900-01-01 00:00:00.0000000) for records already existing when we turned the table into a temporal table. The ValidTo shows the time when the record was no longer valid. That is, the same point in time when there was a replacement record (the update statement).
Deleting from temporal tables
Deletes will remove the record from the normal table, but there will also be an insert to the history table to preserve the record:
DELETE FROM Cars WHERE CarId = 1
SELECT * FROM dbo.Cars SELECT * FROM dbo.Cars_history
Querying temporal tables
If you want to use current records in a dimension, the query is written in the same way as when you query a non-temporal table. But if you want to use historical records, the FOR SYSTEM_TIME clause must be used. You then have to use one of the five sub-clauses that belongs to FOR SYSTEM_TIME:
AS OF is used when you want to see how the data looked at in a specific point in time. E.g.:
SELECT * FROM dbo.Cars FOR SYSTEM_TIME AS OF '2016-10-18 21:50:00'
FROM TO shows all versions of the records that was active in given time period. That is, records that started before the FROM date and that ended before the TO date.
SELECT * FROM dbo.Cars FOR SYSTEM_TIME FROM '2016-10-18 00:00:00' TO '2016-10-19 00:00:00'
BETWEEN AND is very similar to FROM TO. The difference is that BETWEEN AND also includes records that was active after the upper boundary (AND).
SELECT * FROM dbo.Cars FOR SYSTEM_TIME BETWEEN '2016-10-18 00:00:00' AND '2016-10-19 00:00:00'
CONTAINED IN will only return records whose lifetime occurred completely within the time period.
SELECT * FROM dbo.Cars FOR SYSTEM_TIME CONTAINED IN ('2016-10-18 00:00:00', '2016-10-19 00:00:00')
The ALL sub-clause simply returns all records, both historical and current.
Modifying temporal tables
If you add new columns to a temporal table they will be added automatically in the history table as well. The same goes for dropping columns, unless they are used for temporal table functionality. The period columns are used in an auto-created index called ix_Cars_history so that will prevent you from dropping it. But if you first drop the index and then try to drop the column you get the following error message:
Column ‘ValidFrom’ in table ‘TemporalTablesTest.dbo.Cars’ cannot be dropped because it is a part of period definition.
To do such changes, the system versioning has to be turned off first:
ALTER TABLE dbo.Cars SET (SYSTEM_VERSIONING = OFF)
System versioning has to be turned off if you want to drop a temporal table. Else the following error will occur:
Drop table operation failed on table ‘TemporalTablesTest.dbo.Cars’ because it is not supported operation on system-versioned temporal tables.
To drop the table, use the following syntax (remember to drop the history table as well):
ALTER TABLE dbo.Cars SET (SYSTEM_VERSIONING = OFF) DROP TABLE dbo.Cars DROP TABLE dbo.Cars_history
Temporal tables indexing
Both the history table and the normal table can be indexed as any other table.