by CData Software | November 22, 2022

A Short Guide on Data Replication Types and Use Cases: History Replication

cdata virtuality

The last instalment in our series on replication. In this blog post we will tackle history replication and its use cases. Until now we looked at CopyOver ReplicationIncremental and Upsert Replication and Batch Replication.

What is history replication?

The purpose of history replication is to historize the data changed in the source table to the destination table with added validity time frame. This replication type implements the Slowly Changing Dimensions Type 2 process known from Data Warehousing.

It is very useful in situations where we need to keep track of changes made to data, such as when you have to deal with financial transactions and need to have a clear view of all the activities involved. Or when you need to have a history of the data for security audits. Complete, incremental, or batch replication will always yield a table which contains all available data, so they aren’t really fit for situations where you need to track changes.

Incremental or batch updates may keep ‘old’ data which is not present in the data source anymore.  Generally, changes of data cannot be tracked, and the user is always forced to work with the most current information.

That is why if you need to know when and how the data has been changed, history update is the right replication type to use. There are several ways to implement this type of replication.

History replication on CData Virtuality 

In our platform, history replication is based on performing a kind of versioning on the data records by adding two columns to the storage table where the data is replicated to. These columns, called fromtimestamp and totimestamp, provide information on the period when the given record was the current version. Here’s how this works:

  • Select columns of the source object (either table or view) which will be considered the key columns. Please keep in mind that the fields should be chosen in such a way that duplicates (entire rows with the same values in all fields) cannot occur. Usually this is the primary key or keys of the source. 
  • Select columns to be tracked for updates when the replication job runs.

Interested in seeing how you can get started?

The CData Virtuality is trusted by businesses around the world to help them harness the power of their data. Get a free trial to experience its transformative power yourself.