Posted in:

The Technology and the Functioning of Oracle

The Change Data Capture (CDC) is a feature of the Oracle database management system that tracks and monitors all changes made to the database so that specific subsequent action can be taken on them. The Oracle CDC is fundamentally software design patterns that are optimized for data identification, data delivery, and data integration for all changes made at a source database. It is also a great way to enable data integration across enterprises and enhance the quality and output performance of databases by speeding up data warehousing solutions. 

Among the key functions of the Oracle CDC (which will be studied in detail later in this post) is its ability as an efficient tool and a non-intrusive method to implement replication activities without causing any slowing down of performance levels. It can also migrate databases to the cloud without the necessity to shut down the source database or offload analytics queries to data warehouses. The main role of the CDC is to extract changes or incremental data generated after the last run and load it to a data warehouse. 

In a nutshell, Oracle CDC captures and preserves the state of the data in any database or data repository system with the whole mechanism taking place in a data warehouse environment. There are several ways that users can configure Oracle CDC, starting from physical storage to application logic or in single or many combinations of system layers. 

The Origin and Evolution of the Oracle Change Data Capture Technology

Oracle first launched the Change Data Capture with its 9i version. This feature was in-built into Oracle databases and was used mainly to track and record all changes to data made in user tables and databases. These records of changes were stored in change tables for later use in ETL-related applications to be formatted and loaded into data warehouses and other databases. The functioning of the CDC in the launch version was through triggers which proved to be a complex affair. This technology was deemed to be very invasive by users.  

In keeping with the feedback of this technology, a newer version of the Oracle CDC was introduced with the 10g version of this database management system. It turned out to be far less intrusive know-how and was named Oracle Streams by the company. Along with the in-built replication tool of Oracle, this CDC version leveraged the redo logs of the source database. It resulted in effectively detecting and moving change data to a target storage repository without affecting in any way the performance of the source database. 

Surprisingly though, despite it being a very user-friendly tool, Oracle discontinued Streams from the 12c version that no longer provided the required support to Oracle CDC. Users had no alternative but to either choose to use another Oracle replication or CDC tool or pay for the new Oracle GoldenGate that provided the CDC feature out of the box. 

The Modes of the Oracle Data Integrator Supporting CDC

The whole idea of the CDC revolves around the premise that the data at the source database is changed and the target database is required to take some action based on those changes. It is not necessary for the source and the target databases to be the same as Oracle CDC works equally well even if both are the same entities. This is because several CDC solutions can co-exist in the same system. 

The changes made at the source are principally identified by the Oracle Data Integrator which helps Oracle CDC to do so. The Oracle Data Integrator supports two modes which are:

  • Synchronous Mode: In this mode, triggers that are placed in the source database make sure that any changes to the data made are captured instantly. Each SQL statement carries out a DML (Data Manipulation Language) activity that consists of an insert, update, or delete task. The changed data that is captured is also an integral part of the transactions that are responsible for changing the data at the source. The Synchronous mode of the Oracle CDC is available with the Enterprise and the Standard editions of Oracle. 
  • Asynchronous Mode: The functioning of this mode is somewhat different from the previous one. Here, data is first transferred to the redo log files after which the changes to the data are captured after a SQL statement is taken through a DML action. Since the transaction is not captured as a component of what changed the data at the source table, the modified data does not have any effect on it.  The three modes in Asynchronous CDC are HotLog, Distributed HotLog, and AutoLog.  

In both cases, configuring the Oracle Data Integrator is an automated affair. 

The Benefits of Using Oracle CDC to Extract Databases

Why should Oracle CDC be used to extract databases and what are the benefits it brings to the table?

  • Staging data: Flat files are not required in Oracle CDC to stage data as it is placed in relational tables directly. In any other CDC solution, complete tables have to be moved into flat files. 
  • Optimized interface: An easy-to-publish and subscribed interface is provided by the Oracle CDC through DBMS_LOGMNR_CDC_PUBLISH and DBMS_LOGMNR_CDC_SUBSCRIBE packages.
  • Extraction of data: Any changes to the source database such as Insert, Update, and Delete activities are extracted immediately by Oracle CDC. Without CDC, Insert data is difficult to extract and not possible for Update and Delete as the data is no longer available in the table. 
  • Cost-effective: While GoldenGate provides a paid service to use Oracle CDC, it is still a cost-effective proposition if compared to the charges of third-party vendors for the same CDC solutions.  

Summing up, the Oracle CDC technology has automated migration and replication activities, doing away with the need for human intervention at any stage.