I recently wrote about using clickstream event collectors, such as Snowplow or Divolte, to power more reliable and deeper analytics. It is, however, possible to create your own clickstream event collector in a few clicks using Microsoft's Azure cloud.
- Logic App: The role of the Logic App is only to capture the incoming event message and push it to an Event Hub "topic".
- Event Hub: The Event Hub hosts the events for real-time processing; a specific setting called Data Capture allows the data to be exported to Azure Blob Storage or Data Lake Storage.
- Data Lake Storage: Provides long-term storage for the data pushed to the Event Hub.
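To make the flow concrete, here is a sketch of the kind of JSON payload a browser-side tracker might POST to the Logic App's HTTP endpoint. The field names are illustrative assumptions, not a Snowplow or Divolte schema:

```python
import json
from datetime import datetime, timezone

# Hypothetical clickstream event; field names are illustrative only.
event = {
    "event_type": "page_view",
    "page_url": "https://example.com/products/42",
    "referrer": "https://www.google.com/",
    "user_id": "anon-123",  # e.g. a first-party cookie value
    "timestamp": datetime.now(timezone.utc).isoformat(),
}

# The Logic App receives this body and forwards it to the Event Hub as-is.
payload = json.dumps(event)
print(payload)
```

The Logic App does not need to understand this payload; it simply relays the raw body, which keeps the collector schema-agnostic.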
Once the data is in Data Lake Storage, it can be queried using U-SQL with Azure Data Lake Analytics.
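Locating the captured files helps when writing those queries. Data Capture writes Avro files under a configurable naming convention; the default pattern is sketched below, with example values (the namespace and hub names are placeholders):

```python
# Default Event Hub Data Capture naming convention; each placeholder is
# filled in by Azure when the capture file is written.
capture_pattern = (
    "{Namespace}/{EventHub}/{PartitionId}/"
    "{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}"
)

# Example values only; a real path depends on your namespace and hub names.
path = capture_pattern.format(
    Namespace="myns", EventHub="clickstream", PartitionId="0",
    Year="2019", Month="01", Day="15", Hour="10", Minute="30", Second="00",
)
print(path)  # myns/clickstream/0/2019/01/15/10/30/00
```

This folder layout effectively gives you time-based partitioning for free, which is convenient when a query only needs a specific day or hour of data.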
There are a few pros and cons to leveraging this type of serverless solution to capture clickstream data.
- Pro: No load balancer, autoscaling setup, or application maintenance to manage
- Pro: Simple to set up
- Pro: More control over where your data is captured than with GA360 (cloud hosting), which is particularly relevant for companies where AWS and GCP aren't welcome
- Pro/Con: Pricing is per event, which can be cheap at low volumes but can end up much more expensive than GA360 if a large amount of data needs to be collected
- Con: More difficult to test than pure code
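To make the pricing trade-off concrete, here is a back-of-envelope calculation. The rate below is a placeholder, not a quote; real Azure pricing combines throughput units, ingress events, and capture, and changes over time:

```python
# Hypothetical per-million-event rate (USD); a placeholder for illustration.
price_per_million_events = 0.03

def monthly_cost(events_per_day: float) -> float:
    """Rough monthly ingress cost for a given daily event volume."""
    return events_per_day * 30 / 1_000_000 * price_per_million_events

print(monthly_cost(100_000))        # small site: around $0.09/month
print(monthly_cost(1_000_000_000))  # very large site: around $900/month
```

The point is that per-event pricing scales linearly with traffic, whereas a GA360 license is a (large) fixed cost, so the break-even point depends entirely on your volume.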
The clickstream collector can be set up in a three-step process: first, setting up the storage layer (Blob or Data Lake), then setting up the Event Hub with Data Capture, and finally creating a Logic App that will send events to the Event Hub.
1. The first step is to create a Blob Storage or Data Lake Storage account. This is where the data will ultimately be hosted.
2. The second step is to create an Event Hub with Data Capture turned on. This exports the ingested data to Blob/Data Lake Storage at configurable intervals.
3. The third step is to create the Logic App itself. It only needs two components: an HTTP Request trigger that receives the event, and a Send Event action that forwards it to the Event Hub.
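Under the hood, the "send to Event Hub" step is just an authenticated HTTPS POST to the Event Hubs REST endpoint (`https://{namespace}.servicebus.windows.net/{eventhub}/messages`). The Logic App action handles authentication for you, but as a sketch, the Shared Access Signature token it relies on can be generated like this (the namespace, hub, policy name, and key below are placeholders):

```python
import base64
import hashlib
import hmac
import time
import urllib.parse

def make_sas_token(uri: str, key_name: str, key: str, ttl: int = 3600) -> str:
    """Build an Event Hubs / Service Bus Shared Access Signature token."""
    expiry = str(int(time.time()) + ttl)
    encoded_uri = urllib.parse.quote_plus(uri)
    # The string to sign is the URL-encoded resource URI plus the expiry.
    to_sign = (encoded_uri + "\n" + expiry).encode("utf-8")
    signature = base64.b64encode(
        hmac.new(key.encode("utf-8"), to_sign, hashlib.sha256).digest()
    )
    return "SharedAccessSignature sr={}&sig={}&se={}&skn={}".format(
        encoded_uri, urllib.parse.quote_plus(signature), expiry, key_name
    )

# Placeholder namespace, hub, policy, and key; substitute your own values.
token = make_sas_token(
    "https://myns.servicebus.windows.net/clickstream",
    "send-policy",
    "base64-key-placeholder",
)
```

The token goes in the `Authorization` header of the POST; using the built-in Logic App connector simply spares you from managing this signing logic yourself.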