Streaming Events from SQL Server to Event Hub in Azure using Debezium

Reveation Labs
6 min readSep 8, 2020

Streaming SQL Server CDC to Azure Event Hub

In the following article, we will see how to:

  • Install Java
  • Create Azure Event Hub namespace
  • Setup Kafka Connect with Debezium SQL Server Connector
  • Setup SQL Server Management Studio and enable CDC (Change Data Capture) on table
  • Stream Event from SQL Server to Azure Event Hub

Java Installation

Make sure you have java in your machine.

Run the command, java -version in your terminal, to check for the same.

Java Installation Version

To download and setup jre, follow the below link and download jre for your system.

Run the .exe file and follow the installation setup. Once the installation is complete, copy the path to the java bin folder into your system environment variable path. To do so, search for Environment Variables from Start Menu, and open it. Select the Environment Variables option on bottom right, click on path and edit. Add the path, and click OK. Once done, you may restart your system.

Setting the Environment Variable
Setting the Environment Variable
Setting the Environment Variable

Creating an Event Hub Namespace:

Follow the above link from Microsoft’s Official Documentation to create an Event Hub Namespace.

https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-create

Setting up Kafka Connect and Debezium

What is Kafka Connect?

Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other data systems. It makes it simple to quickly define connectors that move large data sets into and out of Kafka.

What is Debezium?

Debezium is built on top of Apache Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Start it up, point it at your databases, and your apps can start responding to all of the inserts, updates, and deletes that other apps commit to your databases.

Kafka libs:

https://archive.apache.org/dist/kafka/2.2.1/kafka_2.12-2.2.1.tgz

Extract the files from above download into a folder named kafka.

Debezium SQL Server connector libs:

https://repo1.maven.org/maven2/io/debezium/debezium-connector-sqlserver/0.10.0.Final/debezium-connector-sqlserver-0.10.0.Final-plugin.tar.gz

Extract the files from the above download, copy the .jar files from debezium-connector-sqlserver into kafka\kafka_2.12–2.2.1\libs folder.

Configure Kafka connect distributed properties:

Create a file(or replace the file) named connect-distributed.properties inside kafka\kafka_2.12–2.2.1\config and copy the following contents into that file.

bootstrap.servers=namespace.servicebus.windows.net:9093 #add event hub namespace 
group.id=connect-cluster-group
# connect internal topic names, auto-created if not exists
config.storage.topic=connect-cluster-configs
offset.storage.topic=connect-cluster-offsets
status.storage.topic=connect-cluster-status
# internal topic replication factors - auto 3x replication in Azure Storage
config.storage.replication.factor=1
offset.storage.replication.factor=1
status.storage.replication.factor=1
rest.advertised.host.name=connect
offset.flush.interval.ms=10000
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
# required EH Kafka security settings
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username= "$ConnectionString" password= "PASSWORD";
producer.security.protocol=SASL_SSL
producer.sasl.mechanism=PLAIN
producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username= "$ConnectionString" password="PASSWORD";
consumer.security.protocol=SASL_SSL
consumer.sasl.mechanism=PLAIN
consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username= "$ConnectionString" password="PASSWORD";
plugin.path=E:\kafka\kafka_2.12–2.2.1\libs # path to the libs directory within the Kafka release

Copy the Host Name from Azure Event Hub Namespace and copy it to bootstrap.servers value.

Azure Event Hub Namespace

Go to Shared access policies, select your policy and copy the Connection string primary key and replace “PASSWORD” with it in connect-distributed.properties. The connection string looks something like:

“Endpoint=sb://namespace.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=mkengtewngwngwk/gkewekbg=”

Connection String for Azure Event Hub Namespace

Also, replace plugins.path value with path to libs folder in kafka.

Save connect-distributed.properties inside config folder.

Open powershell, go to the directory where you have the kafka folder and run the following command:

./bin/windows/connect-distributed.bat ./config/connect-distributed.properties

Running Kafka Connect with Debezium SQL server connector.

Hurray.. Now we have kafka connect running. Moving forward, we’ll set-up SQL Server CDC, and configure it to post change events to Event Hub.

Setting up SSMS and enabling CDC (Change Data Capture) on Table

What is SQL Server?

By definition, Microsoft SQL Server is a relational database management system developed by Microsoft. It is a software product with the primary function of storing and retrieving data as requested by other software applications — which may run either on the same computer or on another computer across a network.

Downloading SQL Server

Follow the above link to download SQL Server 2019 Developer.

Once you download and install the SSMS, create a login, and provide it ‘sysAdmin’ role. Make sure that SQL server agent is running. If not, right click on it, and select Start.

SSMS SQL Server Agent

Also, assure that you have TCP/IP request enabled. To check for same, press Ctrl+R and run

compmgmt.msc

Go to SQL Server Network Configuration in SQL Server Configuration Manager and enable TCP/IP.

Enable TCP/IP protocol for SQL Server

Enabling CDC on table

Create a database and table and run the below script to enable CDC for your table:

  1. To enable CDC for your database, click new query in SSMS and run:
USE MyDB
GO
EXEC sys.sp_cdc_enable_db
GO

2. To enable CDC for your table:

USE MyDB 
GO
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'users',
@role_name = N'MyRole',
@supports_net_changes = 1
GO

You can provide any name to @role_name.

3. Check if CDC is enabled:

USE MyDB
GO
EXEC sys.sp_cdc_help_change_data_capture
GO

Now, create a file named sqlserverproperties.json. and copy the following contents to the file.

{
"name": "sql-server-connection",
"config": {
"connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
"database.hostname": "localhost",
"database.port": "1433",
"database.user": "user",
"database.password": "user",
"database.dbname": "cdc",
"database.server.name": "userserver",
"table.whitelist": "dbo.users",
"database.history":"io.debezium.relational.history.MemoryDatabaseHistory"
}
}

Change database.user, database.password, database.dbname and table.whitelist with your SQL server settings and save the file to kafka folder.

Open bash, and run the following command to post your configurations.

curl -X POST -H “Content-Type: application/json” — data @sqlserverproperties.json http://localhost:8083/connectors

To check for status, run,

curl -s http://localhost:8083/connectors/sql-server-connection/status

Now that you have successfully setup everything, let’s make some changes to the table and look at the logs in Azure Event Hub. Go to Event Hubs, select userserver.dbo.users, Process Data and Explore.

Event Hubs
Process Event Hub Data

You can see the change events captured from your table in the input preview.

Events from your table

So, IT’S DONE!!

Photo by Konstantin Planinski on Unsplash

--

--

Reveation Labs

We are an established software development Company in USA, dealing in blockchain, custom & b2b ecommerce web development, Web 3.0 - https://www.reveation.io/