In this blog we will show you how to detect and forward changes made to a MySQL table and forward them as events into an Azure Event Hub using ENZO DataZen. This can be achieved directly with DataZen and a subscription to Microsoft Azure.
In this example, we will configure DataZen to read data from a source table in a MySQL database, and forward both
the initial set of records, and ongoing changes detected. It should be noted that DataZen is able to automatically
identify and extract changes using a "micro-batch" pattern from any source system, including a MySQL table.
While this blog uses MySQL as the data source, using Oracle, Teradata or SQL Server as the source system is just as simple.
DataZen sends all available records from the source system as an initial load the first time the job runs;
once the data has been replicated the first time, DataZen automatically
detects changes over time and sends changes to the messaging platform.
The main advantage of using a messaging platform as a target system is to decouple entirely the replication engine from the target system itself.
We will be using a MySQL table as the source system, and will demonstrate how to setup a Direct
Job replication. Direct Jobs combine both a Job Reader and a Job Writer into a single
operation for simpler replication topologies.
To create a Direct Job, start DataZen, click on the desired Agent (at least one agent must be registered),
and select NEW -> Direct Data Sync Job from the context menu.
Enter a Job Name, and specify a connection string to the source system. You can click on the ellipses ...
to create a Central Connection String. Central Connection Strings allow you to create a single connection securely, give it a
friendly name, and reuse the connection across multiple jobs.
Once the connection string has been created, select it from the dropdown box.
Next, enter a valid SQL statement to read data from a MySQL table and click on preview. For example, the following command will return data from a sample database called Sakila:
SELECT * FROM Sakila.film
The statement specified will be wrapped by another SELECT operation so that only a few records are returned at a time.
Click on the Replication Settings tab and select a Key Column (in this example, we picked film_id).
You may select more than one field as the Key Column if needed. To propagate delete operations, check the
Identify and propagate deleted records.
Choose a replication schedule (as a Cron expression), and select a folder in which the Sync files will be created.
Choosing a Timestamp/Date Column speeds up the identification of modified records significantly. It is only available if the SQL command is a SELECT operation.
Click on the Target System tab and specify a target connection string. We will be using an
Azure Event Hub connection. Select the Messaging option for the Target System, and
click on the ellipses ... to create a new target connection. This brings up the Central Connection Strings
Select New -> New Messaging Endpoint..., choose Azure Event Hub, enter a friendly name and paste your Event Hub connection string.
Use the Azure Portal to create an Event Hub Namespace, and an Event Hub. You can use the Event Hub Namespace Shared Access Policy as the connection string. For testing purposes, you can use the default RootManageSharedAccessKey access policy. Locate the RootManageSharedAccessKey access policy and copy/paste its Connection String. Using a Shared Access Policy connection string instead of an Event Hub allows you to use different Event Hub targets later for upsert (an upsert operation is either an insert or an update operation) and delete operations.
Once you have created the Event Hub connection string, select it from the dropdown list. Under the Upsert Request tab, enter the target Event Hub Name where all Upsert operations will be sent to. Next, build a JSON payload by clicking on the Generate XML/JSON Payload link. Select all the available fields. A payload will be automatically generated.
Because the target system is not a relational database, certain options are not available.
If you wish to forward deleted records, select the Delete Request tab, provide the Event Hub name where Deleted events will be sent to, and build a JSON payload. Usually, sending the Key Column field is sufficient for deleted payloads.
Click OK to create the Direct Job.
If you have selected a Schedule (Cron Expression) the job will start immediately; otherwise you will need to click on the Start icon for the job. The example provided selects records from a table in MySQL and forwards them to an Azure Event Hub. The first time it runs, all the records will be sent as individual JSON documents; then as changes take place in the MySQL table, modified and deleted records will be sent to the same Event Hub as JSON documents as well.
This blog shows you how you can easily replicate database tables to a messaging bus, using MySQL as the source
and an Azure Event Hub as the target. DataZen makes it easy to extract Change Logs from any source system and
forward them to any other platform, including messaging platforms such as Azure Event Hub, RabbitMQ and AWS SQS.
To try DataZen at no charge, download the latest edition of DataZen on our website.
Integrate data from virtually any source system to any platform.
Support for High Watermarks and fully automated Change Data Capture on any source system.
Extract, enrich, and ingest data without writing a single line of code.
To learn more about configuration options, and to learn about the capabilities of DataZen, download the User Guide now.
© 2023 - Enzo Unified