Saturday, November 8, 2008

Using SQL 2008 Change Data Capture (CDC)

SQL 2008 has a new feature called Change Data Capture (CDC). When enabled for a table it stores before and after values of the tracked columns in a change tracking table.

1. Enabling CDC for a database

To capture data changes for a table we need to first enable change tracking at database level. Execute the following script to enable CDC for the database.

use 'YourDBName'

go

exec sys.sp_cdc_enable_db

2. Setting CDC for the table

Use sys.sp_cdc_enable_table for a table to set CDC. By default, all
columns are tracked. We can strict CDC to specific columns by specifying column names for parameter ‘@captured_column_list’

exec sys.sp_cdc_enable_table
@source_schema = 'dbo'
,@source_name = 'Person'
,@role_name = 'rptrole'
,@capture_instance = 'Person'
/* This will create a change tracking table cdc.dbo_Person_CT */
,@supports_net_changes = 1
/* Indicates if support for quering net changes are allowed or not
allow = 1 */
,@index_name = 'PK_Person_PersonID'
/* change tracking requires a unique index to track changes against*/
,@captured_column_list
= 'Address1,Address2,City,State,Zip,PhoneNo,PersonID'

/* Changes will be tracked for columns mentioned in this parameter.
Default is all*/
,@filegroup_name = 'cdc'
/*file group where cdc objects will be created It’s a good practice to
create separate filegroup for CDC.Place the filegroup in a different
Drive to minimize impact on I/O
*/
-- ,@partition_switch = 'partition_switch'

/* Indicates if switch partition command is allowed or not.
applicable to patitioned tables only*/

3. Retrieving Change


Change data for the above example would be stored in cdc.dbo_Person_CT.It can be used in conjunction with cdc.lsn_time_mapping table to retrieve data for a time period. CDC provides functions for change retrievel as well.

Get Start LSN –

sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', @begin_time)

Get End LSN -

sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time)

Get net changes for the LSN range (shows only final content of a row for the range)

SELECT * FROM cdc.fn_cdc_get_net_changes_dbo_Person(Start LSN, End LSN,'all');

Get all changes for the LSN range (shows all changes for a row in the LSN range)

SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_Person(Start LSN, End LSN,'all');

4. CDC Jobs


CDC creates a capture and cleanup job. Use the below SQL to see job configuration.

exec sys.sp_cdc_help_jobs

By default cleanup job is configured to retain up to 72 hrs changes and capture job is set to run continuously.

We can use sys.sp_cdc_change_job to modify the configuration of cdc jobs.