[Diagram] Oracle GoldenGate Process Architecture Mechanics | Classic vs MA


When handling Oracle GoldenGate (OGG) in practice, the most important thing is to accurately understand the “Data Flow” and the “Role of each process.”

OGG is currently moving from the traditional Classic Architecture to the newer Microservices Architecture (MA): 19c is the transition release in which both are used, and 23ai/26ai are the full-migration releases. If you do not understand how process names and communication methods differ between the two, you will stumble in design and troubleshooting.

In this article, I will thoroughly explain the replication mechanics of Oracle GoldenGate with text diagrams, covering both Classic and MA.

Conclusion / Summary

  • Biggest difference between Classic and MA: Whether data transfer is handled by “Processes (Pump/Collector)” or “Services (Distribution/Receiver).”
  • Common concepts: The basic concepts of “Trail Files” for storing data and “Checkpoints” for progress management remain unchanged.
  • The world of 23ai/26ai: MA becomes the standard, with WebSocket/HTTPS communication and distributed (Mesh) configurations assumed as the baseline.

1. Common Concepts: Trail Files and Checkpoints

Before looking at the architectures, let’s cover the “data container” (Trail Files) and the “progress ledger” (Checkpoints) that are common to all OGG versions.

Trail File

A file that temporarily stores data extracted by OGG.

  • Local Trail: Extracted data stored on the Source side.
  • Remote Trail: Data transferred to the Target side.
  • Structure: A serialized binary composed of header information and the data body (Operation Code, Before/After Image).

Checkpoint

Pointer information that records how far each process has read (Read) and how far it has written (Write).

  • Even if a process terminates abnormally, data integrity is maintained by resuming from the checkpoint position (At Least Once).
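The resume-from-checkpoint behavior can be illustrated with a minimal sketch. This is a toy model, not OGG code: the class `ToyReplicat`, the list-based trail, and the `checkpoint_every` and `crash_after` parameters are all hypothetical names chosen for illustration. It shows why restarting from the last saved position loses nothing but may re-read a record (At Least Once).

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplicat:
    """Toy apply process: reads a trail (a list) and saves a read checkpoint."""
    checkpoint: int = 0                          # index of the next record to read
    applied: list = field(default_factory=list)  # everything applied so far

    def run(self, trail, checkpoint_every=2, crash_after=None):
        processed = 0
        pos = self.checkpoint                    # always resume from the saved checkpoint
        while pos < len(trail):
            if crash_after is not None and processed == crash_after:
                return False                     # simulated abnormal termination
            self.applied.append(trail[pos])
            pos += 1
            processed += 1
            if processed % checkpoint_every == 0:
                self.checkpoint = pos            # persist progress periodically
        self.checkpoint = pos
        return True


trail = ["op1", "op2", "op3", "op4", "op5"]
replicat = ToyReplicat()
replicat.run(trail, checkpoint_every=2, crash_after=3)  # dies after op3; checkpoint is at 2
replicat.run(trail)                                      # restart resumes at op3
print(replicat.applied)  # ['op1', 'op2', 'op3', 'op3', 'op4', 'op5']
```

Note that op3 appears twice: nothing is lost, but the record between the last checkpoint and the crash is processed again. A real deployment needs a way to make that replay harmless, which this sketch does not attempt to model.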

2. Classic Architecture Mechanics (~19c)

This is the command-line (ggsci) based configuration that has been used for many years. Processes start directly on the OS and communicate using a proprietary protocol.

Data Flow Diagram (Classic)

[ Source Server ]                                     [ Target Server ]
+---------------------+                               +---------------------+
|  Oracle DB (Redo)   |                               |  Oracle DB          |
+----------|----------+                               +----------^----------+
           | (LogMiner)                                          | (SQL)
           v                                                     |
+----------+----------+     (TCP/IP)      +-----------+     +----+----+
|  Extract Process    | ----------------> | Collector | --> | Replicat|
+----------+----------+     (Push)        +-----+-----+     +---------+
           |                                    |
           v                                    v
   [ Local Trail ]                      [ Remote Trail ]
   (./dirdat/la000000)                  (./dirdat/ra000000)

Note: In actual operations, the standard configuration inserts a “Data Pump Extract” between the Extract and the Collector.

Roles of Major Processes

| Process Name | Role | Characteristics |
|---|---|---|
| Manager | The Boss. Manages startup, shutdown, monitoring, and logs for all processes. | Uses port 7809. If this goes down, the entire system doesn’t stop, but it becomes unmanageable. |
| Extract | Extraction. Captures change data from REDO logs and writes to the Trail. | Coordinates with LogMiner inside the DB in Integrated Capture mode. |
| Data Pump | Transfer. Reads the Local Trail and sends it to the Target (Optional but recommended). | Acts as a buffer during network outages. It is essentially a type of Extract process. |
| Collector | Reception. Receives data on the Target side and writes to the Remote Trail. | A background process dynamically started by the Manager. |
| Replicat | Apply. Reads the Remote Trail, converts it to SQL, and applies it to the Target DB. | Parallel Replicat, which allows parallel processing, is mainstream from 19c onwards. |
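To make the “buffer during network outages” role of the Data Pump concrete, here is a minimal simulation. It is only a sketch under simplifying assumptions: trails are Python lists, and `ToyPump`, `ship`, and the `network_up` flag are hypothetical names, not OGG interfaces. The point is that Extract keeps writing to the Local Trail while the network is down, and the Pump catches up from its own read position once the link returns.

```python
class ToyPump:
    """Toy Data Pump: ships unread Local Trail records when the network is up."""

    def __init__(self):
        self.read_pos = 0  # the pump's own checkpoint into the local trail

    def ship(self, local_trail, remote_trail, network_up):
        if not network_up:
            return 0                          # nothing sent; backlog stays in the local trail
        batch = local_trail[self.read_pos:]
        remote_trail.extend(batch)            # the Collector side writes the Remote Trail
        self.read_pos = len(local_trail)
        return len(batch)


local_trail, remote_trail = [], []
pump = ToyPump()
sent_per_tick = []
for tick, network_up in enumerate([True, False, False, True]):
    local_trail.append(f"change-{tick}")      # Extract keeps capturing regardless
    sent_per_tick.append(pump.ship(local_trail, remote_trail, network_up))

print(sent_per_tick)                 # [1, 0, 0, 3]
print(remote_trail == local_trail)   # True
```

Without the Pump and its Local Trail, the capture side would have nowhere to park changes during the outage, which is why the intermediate hop is recommended even though it is optional.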

3. Microservices Architecture (MA) Mechanics (19c/23ai/26ai)

This is a modern architecture based on REST APIs. Each function operates as an independent “Microservice” and communicates via HTTPS / WebSocket. This is the standard for Oracle DB from 23ai onwards.

Data Flow Diagram (MA)

Note that the Classic “Data Pump” and “Collector” have been replaced by the more capable Distribution Service and Receiver Service, respectively.

[ Source Deployment (MA) ]                            [ Target Deployment (MA) ]
+-------------------------+                           +-------------------------+
| Admin Service (Extract) |                           | Admin Service (Replicat)|
+-----------|-------------+                           +-----------^-------------+
            | (LogMiner)                                          | (SQL)
            v                                                     |
    [ Local Trail ]                                       [ Remote Trail ]
            ^                                                     ^
            | (Read)                                              | (Write)
+-----------+-------------+       (WSS/HTTPS)         +-----------+-------------+
| Distribution Service    | ========================> |   Receiver Service      |
| (Path: source->target)  |                           |                         |
+-------------------------+                           +-------------------------+

Roles of Major Services (The 5 Services)

MA consists of the following 5 major services.

  1. Service Manager (The Gatekeeper)
    • Watchdog and process supervisor for the other services, and the entry point for everything. An NGINX reverse proxy can optionally be placed in front of it.
  2. Administration Service (Management & Control)
    • Handles creation, configuration, and control of Extract and Replicat processes. Functions of the Classic Manager + role of ggsci.
  3. Distribution Service (Transmission)
    • Equivalent to Classic Data Pump (but more functional).
    • Reads Trail files and transfers them to the Receiver Service on the Target. Configured in units called “Path”.
    • Also handles protocol conversion (OGG proprietary → WebSocket/HTTPS) and filtering.
  4. Receiver Service (Reception)
    • Equivalent to Classic Collector.
    • Receives data from the Distribution Service and writes to Remote Trail files. Operates passively by default.
  5. Performance Metrics Service (Monitoring)
    • Collects performance data of each process and visualizes it (provides JSON).
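As a small illustration of the “Path” concept, the sketch below assembles a target URI for a Distribution Path. Treat the details as assumptions: the `/services/v2/targets?trail=...` layout reflects the form commonly shown for distribution path targets, and the host name, port 9003, trail name `ra`, and the helper `build_target_uri` are hypothetical. Check the URI format against the documentation for your release before relying on it.

```python
from urllib.parse import urlencode, urlunsplit


def build_target_uri(host, port, trail, secure=True):
    """Assemble a Receiver Service target URI for a Distribution Path.

    The path and query layout are assumptions for illustration only.
    """
    scheme = "wss" if secure else "ws"        # WebSocket over TLS, or plain WebSocket
    netloc = f"{host}:{port}"
    query = urlencode({"trail": trail})       # name of the Remote Trail to write
    return urlunsplit((scheme, netloc, "/services/v2/targets", query, ""))


uri = build_target_uri("target.example.com", 9003, "ra")
print(uri)  # wss://target.example.com:9003/services/v2/targets?trail=ra
```

Building the URI from parts rather than by string concatenation keeps the scheme (wss vs ws) an explicit decision, which matters for the firewall and certificate checks discussed later in this article.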

4. Process Comparison Table (Classic vs MA)

| Feature | Classic Architecture | Microservices Architecture (MA) |
|---|---|---|
| Overall Management | Manager Process (mgr) | Service Manager |
| Process Control | ggsci command | Administration Service (WebUI / REST / adminclient) |
| Data Extraction | Extract Process | Extract Process (Runs under Admin Service) |
| Data Transfer | Data Pump Extract Process | Distribution Service (Distribution Path) |
| Data Reception | Collector (Dynamic startup) | Receiver Service |
| Configuration Files | Parameter file (.prm) | JSON + Parameter file (Managed via WebUI) |
| Comm. Protocol | TCP/IP (Proprietary) | HTTPS / WebSocket / TCP |

Evolution Points in 23ai / 26ai

  • Classic Removal: OGG 23ai for Oracle Database no longer includes the Classic binaries.
  • Importance of Distribution Path: Using a “Path”, which can be configured via the GUI, is strongly recommended over the traditional Pump process (configured by writing a .prm file).
  • Certificate Management: In MA, SSL/TLS certificates are a built-in, standard part of communication between services, which significantly strengthens security (a Wallet is mandatory).

5. Detailed Technology: Integrated Capture and Replicat

Regardless of architecture, the internal behavior of “Capture (Extract)” and “Apply (Replicat)”, which are the contact points with the DB, is also important.

A. Inside Integrated Capture

Instead of reading REDO log files from the OS, it connects to the LogMiner Server inside the DB instance and receives LCRs (Logical Change Records).

[ Oracle Database (Source) ]
+-------------------------------------------------------+
|  Redo Log / Archive Log                               |
|      ^                                                |
|      | (Read)                                         |
|  [ LogMiner Server ] <--- (Memory Queue) ---+         |
|      | (Sent as LCR)                        |         |
+------+--------------------------------------+---------+
       |                                      | (Attach)
       v                                      |
[ OGG Extract Process ] ----------------------+

  • Benefit: Automatically merges the logs from each node in RAC into a single stream. It can also capture compressed tables and encrypted data, with decryption handled on the DB side.
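The “merges logs from each node in RAC” benefit amounts to interleaving several already-ordered redo streams into one stream ordered by SCN. A minimal sketch of that idea follows; the tuples, SCN values, and node names are made-up sample data, and the real merge happens inside the database, not in user code.

```python
import heapq

# Each RAC node produces its own redo thread, already ordered by SCN.
# Records are (scn, node, change); all values are illustrative.
thread_node1 = [(100, "node1", "INSERT emp 1"), (130, "node1", "UPDATE emp 1")]
thread_node2 = [(110, "node2", "INSERT dept 10"), (120, "node2", "DELETE emp 7")]

# heapq.merge lazily interleaves sorted inputs without re-sorting everything.
merged = list(heapq.merge(thread_node1, thread_node2, key=lambda rec: rec[0]))

for scn, node, change in merged:
    print(scn, node, change)
```

The consumer (Extract, in the article's terms) then sees one SCN-ordered stream and does not need to know how many nodes produced it.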

B. Integrated / Parallel Replicat (Apply)

Instead of executing SQL row by row, it analyzes dependencies and applies them in parallel.

  1. Integrated Replicat (IR):
    • Passes data to the Inbound Server process within the DB, and the application processing is done on the DB side.
  2. Parallel Replicat (PR) – Recommended:
    • OGG side performs dependency calculation and parallelization, applying SQL at high speed via multiple DB sessions.
    • From 19c onwards, Parallel Replicat is the recommended choice unless there is a specific reason to stay on Integrated Replicat.
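The dependency calculation behind parallel apply can be sketched as follows. This is a simplified illustration, not OGG's actual algorithm: it assumes a dependency exists only when two transactions touch the same row key, and `schedule_waves` and the sample transactions are hypothetical. Transactions in the same wave have no conflicts and could be applied by separate sessions; waves run one after another.

```python
def schedule_waves(transactions):
    """Group transactions (in commit order) into conflict-free waves.

    transactions: list of (txn_id, set_of_row_keys).
    A transaction is placed one wave after the latest wave that touched
    any of its row keys, so conflicting changes keep their original order.
    """
    last_wave_for_key = {}
    waves = []
    for txn_id, keys in transactions:
        earlier = [last_wave_for_key[k] for k in keys if k in last_wave_for_key]
        wave = 1 + max(earlier, default=-1)
        if wave == len(waves):
            waves.append([])
        waves[wave].append(txn_id)
        for k in keys:
            last_wave_for_key[k] = wave
    return waves


txns = [
    ("T1", {("emp", 1)}),
    ("T2", {("emp", 2)}),
    ("T3", {("emp", 1)}),             # conflicts with T1
    ("T4", {("dept", 10)}),
    ("T5", {("emp", 1), ("emp", 2)}), # conflicts with T3 and T2
]
print(schedule_waves(txns))  # [['T1', 'T2', 'T4'], ['T3'], ['T5']]
```

Five transactions collapse into three sequential steps here. The more independent the workload, the wider each wave and the larger the gain from multiple DB sessions.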

6. FAQ (Frequently Asked Questions)

Here is a summary of common questions when designing architecture or operating in practice.

Q1. Is a configuration with Classic (Source) and MA (Target) possible?

A. Yes, it is possible. This is a very common configuration during the transition period.

  • Classic (Source) → MA (Target): The Classic Pump process can send data to the MA Receiver Service. Simply specifying RMTHOST <MA-Host>, MGRPORT <Receiver-Port> in the Pump parameter automatically handles protocol adjustment.
  • MA (Source) → Classic (Target): The MA Distribution Service can send data to the Classic Collector.

Q2. If I change a parameter (.prm), is it reflected immediately?

A. No, a process restart is required. If you edit the parameter file for Extract or Replicat, the changes are not applied until you STOP and then START the process. The same holds in an MA environment: parameters changed from the WebUI take effect only after a restart.

Q3. Are Trail files automatically deleted?

A. They are not deleted unless configured, and will continue to consume disk space. You must explicitly set up purging to delete old Trail files.

  • Classic: Write PURGEOLDEXTRACTS in the Manager parameter.
  • MA: Schedule “Purge tasks” from the Admin Service settings screen.
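The selection logic behind a purge rule can be sketched in a few lines. This is a conceptual model only: it assumes a rule that deletes a Trail file when every reader has moved past it and a minimum retention time has elapsed (the idea behind combining PURGEOLDEXTRACTS with options such as USECHECKPOINTS and MINKEEPHOURS). The function `purgeable`, the file names, and the hour-based clock are hypothetical.

```python
def purgeable(trail_files, oldest_reader_seq, now_hours, min_keep_hours):
    """Return the names of trail files that are safe to delete.

    trail_files: list of (name, sequence_number, last_modified_hours).
    oldest_reader_seq: sequence number still being read by the slowest reader.
    """
    return [
        name
        for name, seq, modified in trail_files
        if seq < oldest_reader_seq                  # every reader is past this file
        and now_hours - modified >= min_keep_hours  # and it has aged long enough
    ]


files = [
    ("ra_seq5", 5, 10),   # fully read and old
    ("ra_seq6", 6, 40),   # fully read but too recent
    ("ra_seq7", 7, 47),   # still being read
]
print(purgeable(files, oldest_reader_seq=7, now_hours=48, min_keep_hours=24))
# ['ra_seq5']
```

Requiring both conditions is the design point: age alone can delete a file a lagging Replicat still needs, while checkpoints alone leave no grace period for recovery.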

7. Summary: Practical Check Points

When designing systems or responding to failures, please check the architecture from the following perspectives.

  1. What is the transfer path?
    • If Classic, look at the .prm of the Pump process.
    • If MA, look at the “Path” settings in the Distribution Service on the WebUI.
  2. Checkpoint Lag
    • Which is increasing: Lag at Chkpt (Processing delay) or Time Since Chkpt (No communication time)?
  3. Network Walls
    • In the case of MA, is HTTPS/WebSocket communication (port 443 or dedicated port) from Source to Target allowed by the Firewall?
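The lag check in point 2 can be turned into a simple triage rule. The sketch below follows this article's reading of the two values (Lag at Chkpt as processing delay, Time Since Chkpt as time with no checkpoint activity); the function `triage`, the 5-minute threshold, and the returned labels are hypothetical and should be tuned to your environment.

```python
from datetime import timedelta


def triage(lag_at_chkpt, time_since_chkpt, threshold=timedelta(minutes=5)):
    """Classify a process from the two lag figures reported for it."""
    behind = lag_at_chkpt > threshold       # records are processed late
    silent = time_since_chkpt > threshold   # no checkpoint written recently
    if behind and silent:
        return "stalled-with-backlog"       # check whether the process is hung or blocked
    if behind:
        return "processing-delay"           # still moving but too slow: look at throughput
    if silent:
        return "no-activity"                # idle or cut off: check upstream and the network
    return "healthy"


print(triage(timedelta(seconds=3), timedelta(seconds=8)))    # healthy
print(triage(timedelta(minutes=20), timedelta(seconds=8)))   # processing-delay
print(triage(timedelta(seconds=3), timedelta(minutes=30)))   # no-activity
```

Separating the two symptoms first narrows the search: a growing processing delay points at the apply or capture workload, while a long silence points at the transfer path from point 1 or the network walls from point 3.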

I strongly recommend getting used to the MA data flow centered on the Distribution Service for the Oracle GoldenGate 23ai/26ai era.

[reference]
Oracle GoldenGate 23ai – Get Started

Note: This article explains concepts targeting Oracle Database 19c/23ai.
