Create a Clickstream event collector in 10 minutes using Azure Logic App

Julien Kervizic

I recently wrote about using clickstream event collectors, such as Snowplow or Divolte, to power more reliable and deeper analytics. It is, however, possible to create your own clickstream event collector in a few clicks using Microsoft’s cloud.

Besides the tracking script, the Azure stack can handle all the functions of a clickstream collector with three components: Logic App, EventHub, and DataLake Storage.

  1. Tracking Script: A tracking script is a piece of JavaScript downloaded by the browser, which tracks the different events and sends them to the clickstream collector.
  2. Logic App: The role of the Logic App is only to capture the incoming message and push it to an Event Hub “topic”.
  3. EventHub: EventHub hosts the events for real-time processing; a specific setting called Capture allows exporting the data to Azure Blob Storage or Data Lake Storage.
  4. DataLake Storage: Provides long-term storage for the data pushed to Event Hub.
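To make the tracking script's role concrete, here is a minimal sketch of what such a script could look like. The collector URL, the payload fields, and the helper names are all illustrative assumptions, not part of any real endpoint or library:

```typescript
// Minimal clickstream tracking sketch. COLLECTOR_URL is a placeholder for
// the Logic App's HTTP trigger endpoint (an assumption, not a real URL).
const COLLECTOR_URL = "https://example-logic-app.azure.example/invoke";

interface ClickEvent {
  eventType: string;                    // e.g. "page_view", "add_to_cart"
  timestamp: string;                    // ISO-8601, set at capture time
  page: string;                         // URL of the page the event fired on
  properties: Record<string, unknown>;  // free-form event attributes
}

// Build the JSON payload that will be POSTed to the collector.
function buildClickEvent(
  eventType: string,
  page: string,
  properties: Record<string, unknown> = {}
): ClickEvent {
  return {
    eventType,
    timestamp: new Date().toISOString(),
    page,
    properties,
  };
}

// Fire-and-forget send to the collector. In a browser you would pass
// window.location.href as `page`.
function track(
  eventType: string,
  page: string,
  properties: Record<string, unknown> = {}
): void {
  const event = buildClickEvent(eventType, page, properties);
  fetch(COLLECTOR_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(event),
    keepalive: true, // lets the request survive page unloads
  }).catch(() => {
    /* tracking must never break the page */
  });
}
```

Real collectors like Snowplow's tracker do far more (sessionization, batching, retries), but the core job is the same: build a JSON event and POST it to the collector endpoint.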

Once the data is in Data Lake Storage, it is possible to query it using U-SQL on Azure Data Lake Analytics.

There are a few pros and cons to leveraging this type of serverless solution to capture clickstream data.

  • Pro: No need for a load balancer, autoscaling setup, or “application maintenance”
  • Pro: Simple to set up
  • Pro: More control over where your data is captured than with GA360 (cloud hosting), particularly relevant for companies in which AWS and GCP aren’t welcome
  • Pro/Con: Pricing is per event, which can be cheap at low event volumes, but can end up much more expensive than GA360 when a large amount of data needs to be collected
  • Con: More difficult to test than pure code

The clickstream collector can be set up in a three-step process: first, set up the storage layer (blob or Data Lake); then set up EventHub with Data Capture; and finally create a Logic App that will forward events to the Event Hub.

  1. The first step is to create a Blob Storage or Data Lake Storage account. This is where the data will ultimately be hosted.

  2. The second step is to create an Event Hub with Data Capture turned on. This exports the ingested data to Blob/Data Lake Storage at specific (configurable) intervals.
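Capture writes the ingested events as Avro files under a path built from a naming template configured in the Capture settings; Azure's documented default template is `{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}`. The helper below is an illustrative sketch of that default layout, useful for predicting where a capture window's file will land:

```typescript
// Sketch: build the blob path Event Hub Capture uses under its default
// naming template. The helper itself is illustrative, not part of any SDK.
function capturePath(
  namespace: string,
  eventHub: string,
  partitionId: number,
  windowStart: Date
): string {
  const pad = (n: number) => String(n).padStart(2, "0");
  return [
    namespace,
    eventHub,
    String(partitionId),
    String(windowStart.getUTCFullYear()),
    pad(windowStart.getUTCMonth() + 1), // getUTCMonth is 0-based
    pad(windowStart.getUTCDate()),
    pad(windowStart.getUTCHours()),
    pad(windowStart.getUTCMinutes()),
    pad(windowStart.getUTCSeconds()),
  ].join("/");
}
```

This date-partitioned layout is what later makes it easy to query only a given day or hour of clickstream data from the lake.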

  3. The third step is the creation of the Logic App. The Logic App needs only two components: an HTTP Request trigger that receives the event, and an action that sends it to the Event Hub.
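The Request trigger can declare a JSON schema for the incoming body, so that later steps in the Logic App can reference individual event fields by name. A sketch of such a schema, assuming an illustrative payload with `eventType`, `timestamp`, `page`, and `properties` fields:

```json
{
  "type": "object",
  "properties": {
    "eventType":  { "type": "string" },
    "timestamp":  { "type": "string" },
    "page":       { "type": "string" },
    "properties": { "type": "object" }
  },
  "required": ["eventType", "timestamp"]
}
```

With the schema in place, the Event Hubs “Send event” action simply forwards the trigger body as the event content.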



