You have a table A with new data, and a table B with old data. How do you make sure table B is refreshed with data from table A? You want to insert new records, update existing records and to delete removed records.
The T-SQL merge statement can do all that for you.
The above scenario is very common when loading data into a data warehouse. You probably start with some sort of staging table that holds the current records, and from that stage table you want to move the records into your data warehouse dimension table. Of course, you can delete all old records and simply insert the new ones. But that isn’t really a good solution if you want to keep the history on each record. (Perhaps by using temporal tables?) Or, you can use separate SQL statements for each action. But that solution is not as elegant as using the merge statement.
For this demonstration, I will expand on an earlier blog post that demonstrated how to get records from Excel into a SQL Server table. So, if you want to execute the steps in this demonstration, you first need to create the solution presented in that post.
The Excel solution was a bit simplistic because it just inserted new records into the database table each time the macro was executed. To handle not only insert, but also updates and deletes, we need to use two steps. The first step is to put the records into a new staging table instead of directly into the destination table Customers. Create the staging table with:
CREATE TABLE dbo.CustomersStage ( CustomerId VARCHAR(10) NOT NULL CONSTRAINT CustomerStageId_PK PRIMARY KEY CLUSTERED ,FirstName VARCHAR(100) NOT NULL ,LastName VARCHAR(100) NOT NULL ) GO
This table will hold the new set of records, and the table Customers will hold the old records. Then, to do the merge from CustomersStage into Customers, we will use the following stored procedure:
CREATE PROCEDURE dbo.MergeCustomers AS SET NOCOUNT ON MERGE dbo.Customers AS Trgt USING dbo.CustomersStage Src ON Trgt.CustomerId = Src.CustomerId WHEN NOT MATCHED BY TARGET THEN INSERT (CustomerId, FirstName, LastName) VALUES (Src.CustomerId, Src.FirstName, Src.LastName) WHEN MATCHED AND ( ISNULL(Trgt.FirstName, '') <> ISNULL(Src.FirstName, '') OR ISNULL(Trgt.LastName, '') <> ISNULL(Src.LastName, '') ) THEN UPDATE SET FirstName = Src.FirstName, LastName = Src.LastName WHEN NOT MATCHED BY SOURCE THEN DELETE; DELETE FROM dbo.CustomersStage GO
Here is how this procedure works.
MERGE dbo.Customers AS Trgt USING dbo.CustomersStage Src ON Trgt.CustomerId = Src.CustomerId
Directly after the MERGE statement, we state the table we want to merge into, which is Customers. The alias Trgt is used to make the rest of the code clear whether we reference the target or the source. They often have the same names, so using alias is a good practice. The USING statement declares the source that we want to merge from. The alias Src is used. The ON statement shows the criteria for a match between the source and the target. In this case the matching is done on the column CustomerId, which is the primary key.
WHEN NOT MATCHED BY TARGET…
WHEN NOT MATCHED BY TARGET THEN INSERT (CustomerId, FirstName, LastName) VALUES (Src.CustomerId, Src.FirstName, Src.LastName)
If the CustomerId does not exist in the target, the record from the source is simply inserted.
WHEN MATCHED AND…
WHEN MATCHED AND ( ISNULL(Trgt.FirstName, '') <> ISNULL(Src.FirstName, '') OR ISNULL(Trgt.LastName, '') <> ISNULL(Src.LastName, '') ) THEN UPDATE SET FirstName = Src.FirstName, LastName = Src.LastName
If there is a match between the source and the target we want to update the record in the target with new data from the source. But only if there actually is a change, which is handled by the code after the AND statement.
WHEN NOT MATCHED BY SOURCE…
WHEN NOT MATCHED BY SOURCE THEN DELETE;
The last case in the merge statement covers the case when the record is removed from the source. This means we want to delete it from the target as well.
The last step in the procedure is to delete all records from the staging table. This way, when the Excel macro runs the next time, only new records from Excel will be in that table.
So. From the original solution where we simply inserted all records from Excel each time the macro was run, we have added a staging table and a stored procedure. Now we need to adjust the Excel macro so that when we loop through the rows we add them to the table CustomersStage instead. Finally, once the looping is done, we need to call the procedure MergeCustomers. The new script looks like this:
Sub Button1_Click() Dim conn As New ADODB.Connection Dim iRowNo As Integer Dim sCustomerId, sFirstName, sLastName As String With Sheets("Sheet1") 'Open a connection to SQL Server conn.Open "Provider=SQLOLEDB;Data Source=MSI\SQL2016;Initial Catalog=ExcelDemo;Integrated Security=SSPI;" 'Skip the header row iRowNo = 2 'Loop until empty cell in CustomerId Do Until .Cells(iRowNo, 1) = "" sCustomerId = .Cells(iRowNo, 1) sFirstName = .Cells(iRowNo, 2) sLastName = .Cells(iRowNo, 3) 'Generate and execute sql statement to import the excel rows to SQL Server table conn.Execute "insert into dbo.CustomersStage (CustomerId, FirstName, LastName) values ('" & sCustomerId & "', '" & sFirstName & "', '" & sLastName & "')" iRowNo = iRowNo + 1 Loop conn.Execute "EXEC dbo.MergeCustomers" MsgBox "Customers imported." conn.Close Set conn = Nothing End With End Sub
Row 22 is changed, and row 27 is new compared to the original script.
Now you can add, change and delete records in Excel, and all those actions will be reflected in the database table when you execute the macro.
Be sure to change the data source in row 10 above from “MSI\SQL2016” to your own instance name.