Implementing event analytics
This tutorial walks through how to transform raw event data into sessions. Many “out-of-box” web analytics solutions come already prepackaged with sessions, but they work as a “black box.” It doesn’t give the user either insight into or control how these sessions defined and work. With Cube SQL-based sessions data model, you’ll have full control over how these metrics are defined. It will give you great flexibility when designing sessions and events to your unique business use case. A few question we’ll answer with our sessions data model:- How do we measure session duration?
- What is our bounce rate?
- What areas of the app are most used?
- Where are users spending most of their time?
- How do we filter sessions where a user performs a specific action?
What is a session?
A session is defined as a group of interactions one user takes within a given time frame on your app. Usually that time frame defaults to 30 minutes, meaning that whatever a user does on your app (e.g. browses pages, downloads resources, purchases products) before they leave equals one session.Unify events and page views into single cube
Segment stores page view data as apages table and events data as a tracks
table. For sessions we want to rely not only on page views data, but on events
as well. Imagine you have a highly interactive app, a user loads a page and can
stay on this page interacting with the website for while. Hence, you want to
count events as part of the session as well.
To do that we need to combine page view data and event data into a single cube.
We’ll call the cube just events and assign a page views event type to
pageview. Also, we’re going to assign a unique event_id to every event to use
as primary key.
event column.
Creating Sessions
As a recap, a session is defined as a group of interactions one user takes within a given time frame on your app. Usually that time frame defaults to 30 minutes. First, we’re going to useLAG() function
in Redshift to determine an inactivity_time between events.
inactivity_time is the time in minutes between the current event and the
previous. We’re going to use inactivity_time to terminate a session based on
30 minutes of inactivity. This window could be changed to any value, based on
how users interact with your app. Now we’re ready to introduce our Sessions
cube.
session_id, which is the combination of
the anonymous_id and the session sequence, since it’s guaranteed to be unique
for each session. Having this in place, we can already count sessions and plot a
time series chart of sessions.
Connecting Events to Sessions
The next step is to identify the events contained within the session and the events ending the session. It’s required to get metrics such as session duration and events per session, or to identify sessions where specific events occurred (we’re going to use that for funnel analysis later on). We’re going to declare a join such that theevents
cube has a many_to_one relation to the sessions cube, and specify a
condition, such as all users’ events from session start (inclusive) till the
start of the next session (exclusive) belong to that session.
Mapping Sessions to Users
Right now all our sessions are anonymous, so the final step in our modeling would be to map sessions to users in case, they have signed up and have been assigned auser_id. Segment keeps track of such assignments in a table called
identifies. Every time you identify a user with segment it will connect the
current anonymous_id to the identified user id.
We’re going to create an identifies cube, which will not contain any visible
measures and dimensions for users to use in Insights, but instead will provide
us with a user_id to use in the Sessions cube. Also, identifies could be
used later on to join sessions to your users cube, which could be a cube
built based on your internal database data for users.
identifies and sessions, where
session has a many_to_one relationship with identity.
user_id, which will be either a
user_id from the identifies table or an anonymous_id in case we don’t have
the identity of a visitor, which means that this visitor never signed in.
More metrics for Sessions
Number of Events per Session
This one is super easy to add with a subquery dimension. We just calculate the number of events, which we already have as a measure in theevents cube, as a
dimension in the sessions cube.
Bounce Rate
we’ve just defined the number of events per session, we can easily add a dimensionis_bounced to identify bounced sessions to the Sessions cube. Using
this dimension, we can add two measures to the Sessions cube as well - a count
of bounced sessions and a bounce rate.
First Referrer
We already have this column in place in our base table. We’re just going to define a dimension on top of this.Sessions New vs Returning
Same as for the first referrer. We already have asession_sequence field in
the base table, which we can use for the is_first dimension. If
session_sequence is 1 - then it belongs to the first session, otherwise - to a
repeated session.
form_submitted_count on the Sessions using sub_query.
form_submitted_count is greater
than 0.
with_form_submitted_count measure to get only sessions when
the form_submitted event occurred.