Working with Azure Event Hubs using Kafka Connector for GeoEvent

11-28-2022 01:35 PM
YujingWu
Esri Contributor

Introduction  

Azure Event Hubs is a big data streaming platform and event ingestion service. It can receive and process millions of events per second (see https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-about). Data sent to an event hub can be ingested by GeoEvent for real-time analytics and stored as features to be used within ArcGIS platforms. In addition, GeoEvent can output data to an event hub. For users who would like to leverage Azure Event Hubs with GeoEvent Server, this blog outlines the steps needed to create an Azure event hub and configure a GeoEvent input/output to work with it.  

Before we start, it is important to note that Azure Event Hubs is a subscription-based service with different pricing tiers and fees. GeoEvent uses its out-of-the-box Kafka connectors to ingest data from and send data to Azure Event Hubs instances. For the setup outlined in this article to work, the Event Hubs namespace must therefore use the Standard tier or higher, because the Basic tier does not provide the Kafka endpoint.

Configure Azure Event Hubs

Create a Resource Group  

To create an Azure event hub that GeoEvent can receive data from or send data to, we first need a resource group. If you do not have an existing resource group to house the new event hub, follow the steps below to create one in the Azure portal.

As seen in Figure 1 below, on the homepage of the Azure portal, click the menu button in the top left corner and then click Resource groups in the list of available items.

Figure 1. Homepage of Azure Portal

On the Resource groups page, click on the Create button on the top left corner as shown below in Figure 2.  

Figure 2. Resource groups page

Fill out the form shown in Figure 3 to create a new resource group. Select the subscription under which the resource group will be created, give the resource group a name, and select an Azure location. After the form has been validated, you will be directed to the page shown in Figure 4 below. Click the Create button in the bottom left corner to create the resource group.

Figure 3. Fill out the form to create a new resource group

Figure 4. Review and create the resource group

Create an Azure Event Hubs Namespace 

The next step is to create an Azure Event Hubs namespace. First, navigate to the homepage of Azure Portal. Under Azure services, as shown in Figure 5 below, click on Create a resource. 

Figure 5. Click on Create a resource on the homepage of Azure Portal

To find Azure Event Hubs, click on See more in All services under the search bar as shown in Figure 6. 

Figure 6. List of available resources

After you reach the page of All services, find Analytics among the categories on the leftmost bar and click on it. Event Hubs should show up as one of the available resources to choose from on the right.  

Figure 7. Select Event Hubs under Analytics

After clicking on Event Hubs, you will be directed to the list of existing event hub namespaces. On the top left corner, click on Create. 

Figure 8. List of existing event hub namespaces

Fill out the form to create a new namespace. For the Resource group field, select the resource group just created. For the pricing tier, select Standard or above so that the Kafka endpoint GeoEvent relies on is available. After the form has been completed, review and create the namespace.

Figure 9. Fill out the form to create a new event hub namespace

Create an Azure Event Hubs Instance 

After the namespace is created, navigate to it and click on + Event Hub as shown in Figure 10 below.  

Figure 10. The homepage of the newly created event hub namespace

Fill out the form to create an event hub instance. Once the form is submitted, the event hub instance is ready to use. A more detailed walkthrough of this process can be found in the official documentation of Azure Event Hubs.

Figure 11. Fill out the form to create a new Event hub instance

Obtain Connection String from Azure Event Hubs Namespace 

For GeoEvent to receive data from or send data to Azure Event Hubs, it needs the connection string from the Azure Event Hubs namespace where the event hub instance lives. To obtain this connection string, navigate to the homepage of the namespace and click on Shared Access Policies under Settings as shown in Figure 12 below. Make sure you open the shared access policies of the event hub namespace, not those of the event hub instance.

Figure 12. Navigate to the shared access policies of the event hub namespace

Under Policy, click on RootManageSharedAccessKey. Among the list of keys, there should be an entry labeled Connection string-primary key. This is the connection string needed to configure a GeoEvent input or output to work with Azure Event Hubs.

Figure 13. Connection string of the event hub namespace
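
The namespace connection string also contains the hostname that is needed later for the Kafka Bootstrap Servers field. The following is a minimal sketch, not part of GeoEvent, that extracts that hostname and appends port 9093. It assumes the usual semicolon-separated layout of an Event Hubs connection string (Endpoint=sb://...;SharedAccessKeyName=...;SharedAccessKey=...). The function name and the sample values are hypothetical.

```python
from urllib.parse import urlparse


def bootstrap_server_from_connection_string(connection_string: str, port: int = 9093) -> str:
    """Return '<namespace host>:<port>' derived from an Event Hubs connection string."""
    parts = {}
    for segment in connection_string.strip().split(";"):
        if not segment:
            continue
        # partition() splits on the first '=' only, so '=' padding inside
        # the base64-encoded SharedAccessKey value is preserved.
        key, _, value = segment.partition("=")
        parts[key.strip()] = value.strip()

    endpoint = parts.get("Endpoint")
    if not endpoint:
        raise ValueError("connection string has no Endpoint field")

    # The endpoint looks like sb://<namespace>.servicebus.windows.net/
    host = urlparse(endpoint).hostname
    if not host:
        raise ValueError(f"could not read a hostname from endpoint {endpoint!r}")
    return f"{host}:{port}"
```

Passing the connection string copied from the portal returns a value of the form <namespace>.servicebus.windows.net:9093, which can be pasted into the Kafka Bootstrap Servers field described in the next section.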

Configure GeoEvent

Configure GeoEvent to Ingest Data from Azure Event Hubs 

To configure a GeoEvent input to ingest data from Azure Event Hubs, navigate to the Manager tab in GeoEvent Manager and click on Add Input.

Figure 14. Click on Add Input on GeoEvent Manager

The input types that can ingest data from Azure Event Hubs can be found under the category Apache Kafka. In this blog, we will configure a Subscribe to a Kafka Topic for JSON input as an example.  

Figure 15. List of available input types on GeoEvent to work with Azure Event Hubs

The Subscribe to a Kafka Topic for JSON input can be configured as in the example shown in Figure 16 below. In the Kafka Bootstrap Servers field, enter the hostname of your Azure Event Hubs namespace followed by port 9093 (typically of the form <namespace>.servicebus.windows.net:9093). Note that this port must be open on the firewall so that the Kafka connector can communicate with Azure Event Hubs. The Topic Name should be the name of your event hub instance. You can either create a new GeoEvent definition based on the incoming data or select an existing GeoEvent definition that matches the schema of the incoming data.

Figure 16. The configuration panel of the Subscribe to a Kafka Topic for JSON input
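
If the input later shows no incoming events, a blocked port 9093 is one thing worth ruling out. The sketch below is a generic TCP reachability check written for this article, not a GeoEvent tool. The function name is hypothetical, and it only confirms that a TCP connection can be opened; it does not verify TLS or the SASL credentials.

```python
import socket


def can_reach(host: str, port: int = 9093, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the hostname and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Run from the machine that hosts GeoEvent Server, can_reach("<namespace>.servicebus.windows.net") returning False suggests that a firewall or DNS issue stands between GeoEvent and the Event Hubs namespace.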

Scrolling down, there is an Advanced section at the bottom of the configuration panel. Fill out the fields as shown in Figure 17. The value of the SASL PLAIN Password should be the connection string obtained from the Azure Event Hubs namespace. Finally, click on the Save button to save the configuration. Start the GeoEvent input by clicking on the start button. 

Figure 17. The Advanced section of the configuration panel of the Subscribe to a Kafka Topic for JSON input
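
For reference when filling in the Advanced section, the sketch below collects the settings that generic Kafka clients use to authenticate against the Event Hubs Kafka endpoint, as described in Microsoft's Event Hubs for Kafka documentation. The property names are the librdkafka-style keys used by clients such as confluent-kafka, not GeoEvent field names. How each one maps onto a GeoEvent field other than SASL PLAIN Password is an assumption, and the function name is hypothetical.

```python
def event_hubs_kafka_settings(namespace_host: str, connection_string: str) -> dict:
    """Build generic Kafka client settings for an Event Hubs namespace.

    namespace_host is the namespace hostname without a port, for example
    '<namespace>.servicebus.windows.net'.
    """
    return {
        # Event Hubs exposes its Kafka endpoint on port 9093.
        "bootstrap.servers": f"{namespace_host}:9093",
        # The Kafka endpoint requires TLS plus SASL PLAIN authentication.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The username is the literal text "$ConnectionString", not a placeholder.
        "sasl.username": "$ConnectionString",
        # The password is the full namespace connection string.
        "sasl.password": connection_string,
    }
```

The point to carry over to GeoEvent is that the password is the entire connection string, including the Endpoint= prefix, rather than only the shared access key.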

Configure GeoEvent to Send Data to Azure Event Hubs 

To configure a GeoEvent output to send data to Azure Event Hubs, navigate to the Manager tab in GeoEvent Manager and click on Add Output.

Figure 18. Click on Add Output on GeoEvent Manager

The output types that can write data to Azure Event Hubs can be found under the category Apache Kafka. In this blog, we will configure a Write JSON to a Kafka Topic output as an example. 

Figure 19. List of available output types on GeoEvent to work with Azure Event Hubs

The Write JSON to a Kafka Topic output can be configured as in the example shown in Figure 20 below. In the Kafka Bootstrap Servers field, enter the hostname of your Azure Event Hubs namespace followed by port 9093 (typically of the form <namespace>.servicebus.windows.net:9093). Note that this port must be open on the firewall so that the Kafka connector can communicate with Azure Event Hubs. The Topic Name should be the name of your event hub instance.

In the Advanced section, the value of the SASL PLAIN Password should be the connection string obtained from the Azure Event Hubs namespace. Finally, click on the Save button to save the configuration. Start the GeoEvent output by clicking on the start button.

Figure 20. The configuration panel of the Write JSON to a Kafka Topic output

Testing the Kafka Connectors 

To test the Kafka connectors in GeoEvent, we can send data to the event hub via a GeoEvent Kafka output and ingest the data from the same event hub via a Kafka input.  

In GeoEvent, follow the steps outlined above to create a Write JSON to a Kafka Topic output.  

On your machine, create a JSON file containing the test data below. The JSON file should be placed in a folder registered to GeoEvent.  

[
    {
        "TrackID": "DLN-04-KVZ",
        "ReportedDT": 1662506383605,
        "Geometry": {
            "x": 97.455303,
            "y": 34.640127,
            "spatialReference": {
                "wkid": 4326
            }
        }
    },
    {
        "TrackID": "LHD-09-HNM",
        "ReportedDT": 1662506383651,
        "Geometry": {
            "x": -108.367645,
            "y": 64.236518,
            "spatialReference": {
                "wkid": 4326
            }
        }
    },
    {
        "TrackID": "DPA-58-BCX",
        "ReportedDT": 1662506383697,
        "Geometry": {
            "x": 108.022871,
            "y": 14.868515,
            "spatialReference": {
                "wkid": 4326
            }
        }
    }
]
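
If you prefer to generate the file rather than copy it by hand, the sketch below writes the same three records into a folder and reads them back as a sanity check. The file name test_events.json and both function names are hypothetical. Treating ReportedDT as epoch milliseconds is an assumption based on the magnitude of the values, not something stated by GeoEvent.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# The three test records from this article.
TEST_EVENTS = [
    {"TrackID": "DLN-04-KVZ", "ReportedDT": 1662506383605,
     "Geometry": {"x": 97.455303, "y": 34.640127, "spatialReference": {"wkid": 4326}}},
    {"TrackID": "LHD-09-HNM", "ReportedDT": 1662506383651,
     "Geometry": {"x": -108.367645, "y": 64.236518, "spatialReference": {"wkid": 4326}}},
    {"TrackID": "DPA-58-BCX", "ReportedDT": 1662506383697,
     "Geometry": {"x": 108.022871, "y": 14.868515, "spatialReference": {"wkid": 4326}}},
]


def write_test_file(folder, filename="test_events.json") -> Path:
    """Write the test records as a JSON array into the given folder."""
    path = Path(folder) / filename
    path.write_text(json.dumps(TEST_EVENTS, indent=4), encoding="utf-8")
    return path


def summarize(path):
    """Read the file back and return (TrackID, UTC timestamp, x, y) per record."""
    events = json.loads(Path(path).read_text(encoding="utf-8"))
    rows = []
    for event in events:
        # Assumption: ReportedDT is milliseconds since the Unix epoch.
        when = datetime.fromtimestamp(event["ReportedDT"] / 1000, tz=timezone.utc)
        geometry = event["Geometry"]
        rows.append((event["TrackID"], when.isoformat(), geometry["x"], geometry["y"]))
    return rows
```

Point write_test_file at the folder registered with GeoEvent. Writing the file only after the rest of the test setup is in place avoids the folder input picking it up too early.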

Create a Watch a Folder for New JSON Files input to read the test data. The input should be configured as in Figures 21 and 22 below. Replace the value in the Input Folder DataStore field with your own registered folder that contains the test data as a JSON file.

Figure 21. Configuration of watch a folder for new JSON files input

Figure 22. Configuration of watch a folder for new JSON files input

Create a GeoEvent service as shown in Figure 23 to send the JSON data from the Watch a Folder for New JSON Files input to the Write JSON to a Kafka Topic output. Both input and output should be stopped at this point. 

Figure 23. Simple GeoEvent service to send the test data to Azure Event Hub

Follow the steps outlined in the previous section to create a Subscribe to a Kafka Topic for JSON input. It should ingest data from the same Azure event hub that the Kafka output sends data to. After the input has been created, navigate to the Manager tab in GeoEvent Manager. As shown in Figure 24 below, there should be a GeoEvent service, one input that reads JSON data from a file, one input that reads JSON data from an Azure event hub, and an output that sends JSON data to the same Azure event hub.

Figure 24. GeoEvent Manager with the necessary elements to test Kafka connectors

In GeoEvent Manager, first start the Kafka input and output. When they are both running without error, start the GeoEvent service. Finally, start the input that reads JSON data from the file. If the Kafka connectors are working correctly, the event counts of the Kafka input and output should be consistent: the number of events sent to Azure Event Hubs from GeoEvent should match the number of events ingested by GeoEvent from Azure Event Hubs.

Figure 25. GeoEvent Manager after all components have been started

Contributors:

  • Morakot Pilouk, Ph.D., Author
  • Yujing Wu, Author
2 Comments
GerardoGarcia
New Contributor III

@YujingWu thanks for the article. However, I don't see anything specific to a version. Is something available for connectivity to event hubs for GeoEvent Server at version 10.7.1?

BrockForsythe
New Contributor

Hi @YujingWu thanks for this detailed article. Using 11.0, I am attempting to connect an Azure Eventhub to this Geoevent Server hosted within an Enterprise Portal VM with the goal of reading Eventhub data to a file within the VM using your connection steps above. I'm running into an issue where it seems no data is entering or exiting through the input/output connectors attached to the Geoevent Service. Would you have any idea what might cause this issue?