ELK Stack Setup in Azure to Fetch Data From EventHub

Reveation Labs · Oct 14, 2020 · 7 min read

Prerequisites

1. Basic knowledge of the ELK stack (Elasticsearch, Logstash, Kibana).

2. Familiarity with the Azure portal and an active Azure account.

A Brief Intro to the Services Used

Elasticsearch is a real-time, distributed storage, search, and analytics engine. It can be used for many purposes, but one context where it excels is indexing streams of semi-structured data, such as logs or decoded network packets.
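For example, indexing a single semi-structured log document is just one HTTP request against an index. A minimal sketch; the host localhost:9200, the index name app-logs, and the field values are placeholders, and on a cluster with security enabled you would also pass credentials with curl's -u option:

curl -X POST "http://localhost:9200/app-logs/_doc" \
  -H "Content-Type: application/json" \
  -d '{"timestamp": "2020-10-14T10:00:00Z", "level": "INFO", "message": "user signed in"}'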

Logstash is an open-source data collection engine with real-time pipelining capabilities. Logstash can dynamically unify data from disparate sources and normalize the data into destinations of your choice. Cleanse and democratize all your data for diverse advanced downstream analytics and visualization use cases.
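To get a feel for how a Logstash pipeline is structured (inputs, filters, outputs), you can run a throwaway pipeline from the command line that reads from stdin and prints events to stdout. A sketch, assuming Logstash is installed at the default package path used later in this guide:

echo "hello from logstash" | sudo /usr/share/logstash/bin/logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'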

Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch. You use Kibana to search, view, and interact with data stored in Elasticsearch indices. You can easily perform advanced data analysis and visualize your data in a variety of charts, tables, and maps.

Azure Event Hubs is a big data streaming platform and event ingestion service. It can receive and process millions of events per second. Data sent to an event hub can be transformed and stored by using any real-time analytics provider or batching/storage adapters.

Let’s Start!!!

1. Log in to the Azure portal using this link: https://portal.azure.com/

2. In the portal search bar, type “Elasticsearch (Self-Managed)”, find it in the Marketplace section, and click it.

3. On the Elasticsearch page, click the Create button. It redirects you to the configuration page for creating an Elasticsearch cluster along with Kibana and Logstash.

4. Basic Section:

Subscription : <select your subscription>(ex. subscription-1)

Resource group: <select your resource group>(ex: elk-rg). If it does not exist, create a new one.

Region: <Select your region for this deployment>(ex: South Central US)

Username: <username for logging in to the ELK virtual machines>(ex: elkadmin)

Authentication Type: <Select Password>(you can also use SSH Public Key; a key-generation sketch follows this list)

Password: <enter a strong password for logging in to the ELK virtual machines>(ex: ElK$Set!78%!!1)

Confirm Password: <enter the same password as in the step above>
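If you choose the SSH Public Key authentication type instead of a password, you can generate a key pair locally and paste the public key into the form. A minimal sketch; the file name azure_elk_key is just an example:

ssh-keygen -t rsa -b 4096 -f ~/.ssh/azure_elk_key

cat ~/.ssh/azure_elk_key.pub   # paste this value into the SSH Public Key field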

=> Click on Next: Cluster Settings

5. Cluster Settings:

Elasticsearch Version: <Select Latest Version>(v7.9.0)

Cluster name: <Enter Cluster Name>(ex: elk-cluster)

Virtual network: <leave the default>(you can also create a new one or select an existing one)

Elasticsearch node subnet: <leave the default>(you can also create a new one or select an existing one)

=> Click on Next: Nodes Configuration

6. Nodes Configuration:

Hostname prefix:(The prefix to use for hostnames when naming virtual machines in the cluster. Hostnames are used for resolution of master nodes so if you are deploying a cluster into an existing virtual network containing an existing Elasticsearch cluster, be sure to set this to a unique prefix, to differentiate the hostnames of this cluster from an existing cluster)(ex: elk)

=> For Data nodes section

Number of data nodes: 3

Data node VM size: DS1 v2 (1 vCPU, 3.5 GB memory)

Data nodes are master eligible: Allows data nodes to be master-eligible. Setting this to Yes means the 3 dedicated master nodes will not be deployed. Select Yes.

=> Data node disks

Number of managed disks per data node: 1

Size of each managed disk: 32 GiB

Type of managed disks: The storage type of managed disks. The default will be Premium disks for VMs that support Premium disks and Standard disks for those that do not. Choose “Standard disks”.

=>Master nodes

Master node VM size: DS1 v2 (1 vCPU, 3.5 GB memory)

Client nodes (optional): 0

=> Choose an option based on your load and requirements.

=> Click on Next: Kibana & Logstash

7. Kibana & Logstash

=> Kibana

Install Kibana: yes

Kibana VM size: Standard A2 v2 (2 vCPU, 4 GB memory)

=> Logstash

Install Logstash: Yes

Number of Logstash VMs: 1

Logstash VM size: Standard DS1 v2 (1 vCPU, 3.5 GB memory)

Logstash config file: Skip this for now; we will add it manually later (step 14).

Additional Logstash plugins: logstash-input-azure_event_hubs (the template installs this for you; see the note at the end of this step if you ever need to install it manually)

=>External Access

Use a jump box: No (a jump box allows you to connect to your cluster from a public access point, e.g. over SSH; this is usually not necessary when Kibana is installed, since the Kibana VM itself can act as a jump box)

Load balancer type: External (choose whether the load balancer should be public-facing (external) or internal)
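The deployment installs the plugin listed above for you. If it ever needs to be added or updated manually on the Logstash VM later, the bundled logstash-plugin tool can do it. A sketch, assuming the default package install path:

sudo /usr/share/logstash/bin/logstash-plugin install logstash-input-azure_event_hubs

sudo /usr/share/logstash/bin/logstash-plugin list | grep azure_event_hubs   # verify it is installed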

=> Click on Next: Security

8. Security:

=> In this section, set a password for each of the built-in users of the Elastic Stack.

=> click on Next: Certificates

9. Certificates:

=> In this section, you can set up certificates for the HTTP layer and TLS.

=> Set them up if you want; otherwise, skip this section and keep the defaults.

=> Click on Next: Review + Create

10. Review + Create:

=> Wait for Azure to validate the details, then click Create.

=> Wait for the deployment to succeed.

***Let’s create an Event Hubs namespace and an event hub in Azure***

11. Create EventHub Namespace, Event hub & Consumer Group:

=> Follow the link below to create an Event Hubs namespace and an event hub.

https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-create

=> After that, go to the event hub and create a consumer group.

=> Then copy the event hub’s “Connection string–primary key”. A command-line alternative for these steps is sketched below.
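If you prefer the command line to the portal, the same resources can be created with the Azure CLI. A minimal sketch; elk-rg is the resource group from step 4, and elk-ehns, elk-hub, and logstash are example names. The last command prints the connection string you will paste into the Logstash configuration later:

az eventhubs namespace create --resource-group elk-rg --name elk-ehns --location southcentralus --sku Standard

az eventhubs eventhub create --resource-group elk-rg --namespace-name elk-ehns --name elk-hub --partition-count 4

az eventhubs eventhub consumer-group create --resource-group elk-rg --namespace-name elk-ehns --eventhub-name elk-hub --name logstash

az eventhubs namespace authorization-rule keys list --resource-group elk-rg --namespace-name elk-ehns --name RootManageSharedAccessKey --query primaryConnectionString -o tsv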

***Now the Elasticsearch cluster, Logstash, and Kibana are running on virtual machines in your Azure environment. Let’s configure the event hub and Logstash to run a near-real-time log ingestion pipeline and visualize the data in the Kibana dashboard.***

12. Go to the resource group you used for this deployment

=> In the portal search bar, type “Resource groups” and click it.

=> Click on the resource group you chose or created when deploying the ELK service in Azure (from step 4).

13. SSH into the Logstash virtual machine via the Kibana virtual machine

=> Find the Kibana virtual machine and click on it.

=> You will find the public IP address of the Kibana VM in the Overview section.

=> Open a terminal on your local machine and SSH into the Kibana VM.

=> Command: ssh <username>@<public IP of Kibana> (ex: ssh elkadmin@255.255.255.255).

=> The username is the one from step 4. The first time you connect, you will be asked to add the host to known_hosts; type yes and then enter the password. You should now be in the Kibana virtual machine.

=> From that Kibana SSH session, log in to the Logstash virtual machine using the same steps (ex: ssh <username>@<private IP of the Logstash VM>). You will find the private IP in the Overview section of the Logstash VM.

=> You are now SSHed into the Logstash VM.
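If your local OpenSSH client supports the ProxyJump option (-J), the two hops can also be combined into a single command. A sketch, assuming the elkadmin username from step 4 and placeholder addresses:

ssh -J elkadmin@<public IP of Kibana> elkadmin@<private IP of the Logstash VM>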

14. Run the pipeline on the Logstash virtual machine

=> Go to the folder using this command: cd /etc/logstash/conf.d/

=> In that folder, create a file named logstash.conf and add the content below to it.

input {
  azure_event_hubs {
    event_hub_connections => ["<event-hub-connection-string>"]
    threads => 16
    decorate_events => true
    consumer_group => "<event-hub-consumer-group>"
    initial_position => "end"
    storage_connection => "<storage-account-connection-string>"
    storage_container => "<storage-container-name>"
  }
}

## Add your filters / logstash plugins configuration here
filter {
  # Parse the JSON body of each event into fields and drop the raw message
  json {
    source => "message"
    remove_field => "message"
  }
  # For every operation except deletes ('d'), copy the fields of
  # [payload][after] to the top level, and record the operation code in 'op'
  ruby {
    code => "
      if event.get('[payload][op]') != 'd' then
        event.get('[payload][after]').each { |k, v|
          event.set(k, v)
        }
      end
      event.set('op', event.get('[payload][op]').downcase)
    "
  }
  mutate {
    remove_field => ["schema", "payload"]
  }
}

output {
  # Print each event to the console for debugging
  stdout { codec => rubydebug }
  # Index everything except delete events into Elasticsearch
  if [op] != "d" {
    elasticsearch {
      hosts => ["<elasticsearch hosts>:9200"]
      index => "sql-server-%{+YYYY-MM-dd}"
      user => "elastic"
      password => "<password for built in elastic user>"
      sniffing => true
    }
  }
}

=> This configuration reads the SQL Server change events that arrive in the event hub, filters them, and indexes them into Elasticsearch.

=> Change the following parameters in the file:

event_hub_connections: your event hub connection string (from step 11)

consumer_group: your event hub consumer group (from step 11)

storage_connection: the connection string of a storage account (create one if it does not exist); the plugin uses it to persist processing offsets

storage_container: the name of a blob container in that storage account

hosts: the Elasticsearch hosts (you can find the address on the internal load balancer)

user: the built-in ‘elastic’ user

password: the password of the built-in ‘elastic’ user
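If the storage account and blob container for storage_connection and storage_container do not exist yet, they can also be created with the Azure CLI. A minimal sketch; elk-rg is the resource group from step 4, and elkoffsetstore and logstash-offsets are example names:

az storage account create --resource-group elk-rg --name elkoffsetstore --location southcentralus --sku Standard_LRS

az storage account show-connection-string --resource-group elk-rg --name elkoffsetstore -o tsv

az storage container create --name logstash-offsets --connection-string "<storage-account-connection-string>"

Use the output of the second command as storage_connection and the container name as storage_container.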

=> If you copy the configuration from this page, make sure any curly quotes (“ ”) are replaced with normal straight quotes (") in the file.

=> Run the command below to start the pipeline:

sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf
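Before starting it, you can optionally ask Logstash just to validate the configuration file and exit; this catches syntax mistakes (for example, leftover curly quotes) early:

sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf --config.test_and_exit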

Congratulations, you have successfully created the real-time pipeline from the event-hub to ELK Stack.


Reveation Labs

We are an established software development company in the USA, dealing in blockchain, custom and B2B ecommerce web development, and Web 3.0 - https://www.reveation.io/