Replication

Validating Replicated Data

Problems encountered during replication often occur because data at the Subscriber is not in synchronization with data at the Publisher. Microsoft® SQL Server™ 2000 replication can validate the replicated data at a Subscriber as the replication process is occurring to ensure that data at the Subscriber matches data at the Publisher.

You do not need to stop updates to the Publisher and wait for the Subscriber to become fully synchronized before testing that data has been received and applied correctly. You can validate the data in snapshot replication, transactional replication, or merge replication. Validation can be performed for specific subscriptions or for all subscriptions to a publication.

How Inline Data Validation Works

SQL Server validates data by calculating a rowcount and/or a checksum at the Publisher and then comparing those values to the rowcount and/or checksum calculated at the Subscriber. One value is calculated for the entire publication table and one value is calculated for the entire subscription table, but data in text or image columns is not included in the calculations.

While the calculations are performed, shared locks are placed temporarily on tables for which rowcounts or checksums are being run, but the calculations are completed quickly and the shared locks removed, usually in a matter of seconds.

When validating replicated data, consider the following:

Validating Replicated Data for Transactional Replication

Validation can be performed on transactional replication, subscriptions that use immediate updating or queued updating, and on horizontal and vertical partitions of data.

You can choose any of the following methods for validation:

You can configure validation using SQL Server Enterprise Manager or Transact-SQL system stored procedures. Regardless of which you use, when you run validation, stored procedures are executed at the Publisher. The stored procedure sp_publication_validation calls sp_article_validation for each article that is being validated, and sp_article_validation calls sp_table_validation for each table, which then generates the rowcount or checksum calculations. The sp_table_validation command is posted as a replication command to the Subscriber using the Log Reader Agent and Distribution Agent, and the calculations are then made at the Subscriber.

Note  Subscribers running SQL Server 6.5 can use rowcount only validation, but not checksum validation. You can validate based on a binary checksum calculation if Subscribers are running SQL Server 2000.

You can validate replicated data on a schedule by creating a Transact-SQL job that calls sp_publication_validation or sp_article_validation.

Unless you are a member of the sysadmin or db_owner roles, you must have SELECT permissions on all columns of the base table used in the article (even if the article is vertically partitioned) in order to execute sp_publication_validation.

Validation with Checksums

When checksums are used, 32-bit redundancy check (CRC) occurs on a column-by-column basis rather than a CRC on the physical row on the data page. This allows the columns with the table to be in any order physically on the data page, but still compute to the same CRC for the row. Checksum validation can be used when there are row (horizontal) or column (vertical) filters on the publication. Because checksums can require large amounts of processor resources when validating a large data set, you may want to schedule validation to occur when there is the least activity on the servers used in replication.

Subscribers running SQL Server 7.0 use the checksum routines released in SQL Server 7.0, which generate CRC values that are different than those generated with SQL Server 2000. The checksum routines released in SQL Server 7.0 cannot validate vertical partitions, or logical table structures where column offsets differ (due to ALTER TABLE statements that DROP and ADD columns).

Setting the Rowcount_only Parameter

The @rowcount_only parameter is a smallint and accepts the following values.

Value Description
0 Execute checksum functionality released with SQL Server 7.0.
1 (Default) Execute a rowcount check only.
2 Execute checksum functionality released with SQL Server 2000.

Because Subscribers running SQL Server 7.0 will use this parameter as a bit data type, not a smallint, SQL Server will interpret the parameter as 'ON'. Setting the parameter to a value of 2 with a Subscriber running SQL Server 7.0 will result in a rowcount only validation at the Subscriber. If you need to run a checksum validation for a Subscriber running SQL Server 7.0, use the value of 0 for this parameter. Subscribers running SQL Server 2000 could use the same value (0), but the checksum functionality would have the SQL Server 7.0 limitations.

To validate transactional data using SQL Server Enterprise Manager

  1. At the Distributor, expand Replication Monitor, expand Publishers, and then expand a specific Publisher.

  2. Right-click a transactional publication, and then click Validate subscriptions.

  3. Choose whether you want to validate all subscriptions or just specific subscriptions, and if you want to validate specific subscriptions, select those in the text box.

  4. To choose the type of validation, click Validation Options.

  5. Choose whether you want to compute a fast rowcount based on cached table information, compute an actual row count by querying the tables directly, or compute a fast row count and if differences are found, compute an actual row count.

  6. You can also choose to enable Compare checksums to validate data, a binary checksum (if the Subscriber is running SQL Server 2000), and you can choose to stop the Distribution Agent after the validation has completed.

To validate transactional data using Transact-SQL system stored procedures

Validation and Immediate Updating

When using inline publication validation (sp_publication_validation) on immediate updating subscriptions, there is a period of time when a change on the Subscriber will cause the publication validation to fail. This occurs when a data change is made on the Subscriber after a publication validation has been run on the Publisher, but before the publication validation can be performed on the Subscriber.

With transactional replication (without updatable subscriptions), changes can be made only at the Publisher, so changes made to the Publisher after sp_publication_validation has been executed will be applied at the Subscriber after the validation is run on the Subscriber. 

However, when using immediate updating subscriptions, data modifications can be made at the Subscriber. Any changes made at the Subscriber after validation was run on the Publisher are reflected immediately at the Subscriber. Validation will fail because the checksum and rowcount calculations were based on data in the publication table before changes were made at the Subscriber. To avoid this, stop all data modifications at the Subscriber during the validation process.

Considerations when Validating Replicated Data for Transactional Replication

The following are validation restrictions when using validation for transactional replication:

Validation Failure and Alerts

If validation between data at the Publisher and data at the Subscriber fails, you can configure replication alerts to notify you of the failure (with a message sent through e-mail or to a pager) and you can have the subscriptions reinitialized automatically.

To configure automatic reinitialization of subscriptions that fail validation

  1. At the Distributor, expand Replication Monitor, click Replication Alerts, right-click the Replication: Subscription has failed data validation alert, and then click Properties.

  2. On the General tab, select the Enabled check box.

  3. On the Response tab, select Execute job, and then in the drop down box, click Reinitialize subscriptions having data validation failures.

  4. To send a reinitialize confirmation message to the event log, right-click the Replication: Subscription reinitialized after validation failure alert, and click Properties.

  5. On the General tab, select the Enabled check box.
Validating Replicated Data for Merge Replication

Using SQL Server Enterprise Manager, you can choose to validate all subscriptions to a merge publication. Using Transact-SQL system stored procedures, you can validate all subscriptions to a merge publication or specified subscriptions.

You can choose any of the following methods for validation:

To request validation of replicated data at a merge Subscriber, you can use:

When validating merge replication, another option is to validate data, and if data is not converged, to conduct a partial reinitialization of the subscription. This partial reinitialization will return the Subscriber back to a previous state when data was in synchronization. Using the Validate and Resynchronize Subscription option in SQL Server Enterprise Manager or sp_resyncmergesubscription, you can resynchronize a merge subscription to a known validation state that you specify. This allows you to force convergence or synchronize the subscription database to a specific point in time, such as the last time there was a successful validation, or to a specified date. When resynchronizing a subscription using this method, the snapshot is not reapplied.

To validate merge data using SQL Server Enterprise Manager

  1. At the Distributor, expand Replication Monitor, expand Publishers, and then expand a specific Publisher.

  2. Right-click a merge publication, and then click Validate all subscriptions.

  3. Choose whether you want to validate replicated data using rowcounts only, rowcounts and checksums, or rowcounts and comparing binary checksums (all Subscribers must be running SQL Server 2000 to use this option). Validation will occur the next time the Merge Agent runs with results displayed in Replication Monitor.

To validate and resynchronize subscriptions

To validate merge data using Transact-SQL system stored procedures

To validate merge data using a Merge Agent command line parameter

  1. At the Distributor, expand Replication Monitor, click the Merge Agents folder, right-click an agent, and then click Agent Properties.

  2. On the Steps tab, double click the Run agent step.

  3. In the command text box, type –validate and specify 1 for rowcount-only validation, 2 for rowcount and checksum validation. Validation will occur the next time the Merge Agent runs and success or failure messages are logged in the Merge Agent History.

  4. If you want to schedule validation, set the –ValidateInterval parameter on the Merge Agent command line to the number of minutes when you want the validation to occur (the default is to validate every 60 minutes).