Writing SQL queries
If you're familiar with SQL, use this topic to help you write SQL queries on data exported from your organization's data lake as described in Accessing your data. You can query data to build insights into your organization's site performance.
These queries draw from six tables:
You can see the relationship between these tables in the entity relationship diagram. The data includes facts (numerical data) and dimensions (non-numerical data).
The following SQL queries use the same data that's rendered in the Summary Report.
Querying guest types
To identify the proportion of guest types—specifically VISITOR and CUSTOMER—who are visiting your organization’s website, start by extracting the relevant guest-session records:
select type as guest_type,
s.core_channel,
g.core_first_seen_at,
g.meta_ref
from guests g
left join sessions s on s.meta_guest_ref = g.meta_ref
where g.type in ('VISITOR','CUSTOMER')
and s.meta_created_at_date >= current_date - interval '8' day
and s.meta_created_at_date <= current_date - interval '1' day
This query does the following:
-
Retrieves all guest-session combinations from the past 8 days (excluding today).
-
Filters to include only
VISITORandCUSTOMERguest types. -
Joins session-level data (like
core_channel) with guest records using themeta_refidentifier. -
Filters by
meta_created_at_dateto take advantage of table partitioning for performance.
After extracting the data, you can calculate the number of unique guests per type and their proportions using your Business Intelligence (BI) tool (Power BI, Tableau, or Looker) by grouping by guest_type and counting distinct meta_ref.
Or use this SQL query:
select
g.type as guest_type,
count(distinct g.meta_ref) as guest_count,
round(
count(distinct g.meta_ref) * 100.0 /
sum(count(distinct g.meta_ref)) over (),
2
) as guest_percentage
from guests g
left join sessions s
on s.meta_guest_ref = g.meta_ref
where g.type in ('VISITOR','CUSTOMER')
and s.meta_created_at_date >= current_date - interval '8' day
and s.meta_created_at_date <= current_date - interval '1' day
group by g.type
This query returns the number and percentage of unique guests per type.
In the Summary Report, guest types are visualized as a donut chart, making it easy to see the distribution between visitors and customers over the selected date range.
Querying new visitors and new customers
To monitor new traffic to your website, you can identify new visitors and new customers based on when they were first seen. Start by extracting all VISITOR and CUSTOMER records whose first appearance on the site occurred within the past 8 full days:
select g.type,
s.core_channel,
g.core_first_seen_at,
g.meta_ref
from guests g
left join sessions s on s.meta_guest_ref = g.meta_ref
where g.type in ('VISITOR','CUSTOMER')
and g.core_first_seen_at >= current_date - interval '8' day
and g.core_first_seen_at <= current_date - interval '1' day
and s.meta_created_at_date >= current_date - interval '8' day
and s.meta_created_at_date <= current_date - interval '1' day
This query does the following:
-
Uses a structure similar to the guest types query.
-
Filters by
core_first_seen_atto identify guests who are new within the specified timeframe. -
Filters by
s.meta_created_at_dateto ensure sessions fall within the same period. -
Joins session data to allow breakdowns by channel or timestamp.
After extracting the data, calculate the number of unique new guests by type using your BI tool by grouping by guest_type and counting distinct meta_ref.
Or use this SQL query:
select
g.type as guest_type,
s.core_channel,
g.core_first_seen_at,
count(distinct g.meta_ref) as new_guest_count
from guests g
left join sessions s
on s.meta_guest_ref = g.meta_ref
where g.type in ('VISITOR', 'CUSTOMER')
and g.core_first_seen_at >= current_date - interval '8' day
and g.core_first_seen_at <= current_date - interval '1' day
and s.meta_created_at_date >= current_date - interval '8' day
and s.meta_created_at_date <= current_date - interval '1' day
group by g.type, s.core_channel, g.core_first_seen_at
This query groups by guest type, channel, and first seen date. It also counts distinct guests to avoid double-counting across multiple sessions.
In the Summary Report, the number of new visitors and customers is displayed as a trend over the past eight days.
Querying order items
You can run this query to capture the total value and volume of orders by product category.
select distinct
oi.meta_created_at,
oi.type,
oi.bxt['channel'] as core_channel,
oi.core_price,
oi.core_quantity
from order_items oi
join orders o on o.meta_ref = oi.meta_order_ref
where oi.meta_created_at >= current_date - interval '8' day
and oi.meta_created_at <= current_date - interval '1' day
and oi.core_status in ('PURCHASED','CONFIRMED')
This query does the following:
-
Selects order items created in the last 7 days with a status of
PURCHASEDorCONFIRMED. -
Joins the
orderstable with theorder_itemstable via the order reference (meta_refandmeta_order_ref) to enrich the dataset if needed. -
Extracts
core_channelfrom thebxtJSON-like field. -
Retrieves the unit price and quantity for each item.
-
Uses
distinctto avoid duplicate rows.
In the Summary Report:
-
Order items value is calculated by multiplying
core_pricexcore_quantityand aggregating. -
Order item quantity is calculated using
SUM(core_quantity)
Querying inbound and outbound channels
To analyze traffic patterns, you can track inbound and outbound channels across recent guest sessions:
select meta_created_at_date,
core_channel,
core_point_of_sale,
meta_ref
from sessions
where meta_created_at_date >= current_date - interval '8' day
and meta_created_at_date <= current_date - interval '1' day
and core_channel is not nullThis query does the following:
-
Filters sessions by the partitioned column
meta_created_at_datefor efficient execution. -
Excludes sessions where the
core_channelis null— these typically result from missing or incomplete data. -
Retrieves date, channel, point of sale, and guest identifier for downstream analysis.
After extracting the data, calculate the number of sessions per channel using your BI tool by grouping by core_channel and counting meta_ref. You can also create a custom field to label each core_channel as either inbound or outbound based on your definitions.
Or use this SQL query:
select
core_channel,
count(meta_ref) as session_count,
round(
count(meta_ref) * 100.0 /
sum(count(meta_ref)) over (),
2
) as channel_percentage
from sessions
where meta_created_at_date >= current_date - interval '8' day
and meta_created_at_date <= current_date - interval '1' day
and core_channel is not null
group by core_channel
This query returns the total session count and percentage per channel.
In the Summary Report, inbound and outbound channels are displayed as separate trend graphs.
Querying cart types and proportions
To understand user behavior during shopping sessions, you can analyze cart types and measure their proportions across all sessions.
select meta_created_at_date,
bxt['cartType'] as cart_type,
core_point_of_sale,
meta_ref
from sessions
where meta_created_at_date >= current_date - interval '8' day
and meta_created_at_date <= current_date - interval '1' day
This query does the following:
-
Retrieves the
cartTypefrom thebxtfield in thesessionstable. -
Filters by
meta_created_at_date, a partitioned column, for efficient performance. -
Returns all sessions within the past 8 days.
The cart_type can be one of:
-
BROWSED- no cart interaction -
ABANDONED- item added to cart but not purchased -
CONVERTED- purchase made
After extracting the data, calculate the number and proportion of sessions for each cart type using your BI tool by grouping by cart_type and counting meta_ref.
Or use this SQL query:
select
bxt['cartType'] as cart_type,
count(meta_ref) as session_count,
round(
count(meta_ref) * 100.0 /
sum(count(meta_ref)) over (),
2
) as cart_type_percentage
from sessions
where meta_created_at_date >= current_date - interval '8' day
and meta_created_at_date <= current_date - interval '1' day
group by bxt['cartType']
This query returns the number of sessions per cart type and their percentage of total sessions.
In the Summary Report, session types and cart proportion are displayed as separate graphs.
Querying conversions by day and hour
To understand when users are most likely to convert, you can analyze session-level conversions by day of the week and hour of day. This query builds on the data retrieved from the cart type query.
select meta_created_at_date,
day_of_week(meta_created_at_date) as day_of_week,
core_channel,
core_point_of_sale,
extract(hour from meta_created_at) as hour_of_day,
bxt['cartType'] as cartType,
meta_ref
from sessions
where meta_created_at_date >= current_date - interval '8' day
and meta_created_at_date <= current_date - interval '1' day
and bxt['cartType'] is not null
This query does the following:
-
Extracts the
day_of_weekandhour_of_dayfrom session timestamps. -
Filters to include only sessions with a
cartType— indicating some form of engagement (browsing, abandoning, or converting). -
Returns additional metadata like channel and point of sale.
After extracting the data, calculate session volume per cart type using your BI tool by grouping by day_of_week, hour_of_day, and cart_type, and counting meta_ref.
Or use this SQL query:
select
day_of_week(meta_created_at_date) as day_of_week,
extract(hour from meta_created_at) as hour_of_day,
bxt['cartType'] as cart_type,
count(meta_ref) as session_count
from sessions
where meta_created_at_date >= current_date - interval '8' day
and meta_created_at_date <= current_date - interval '1' day
and bxt['cartType'] is not null
group by day_of_week, hour_of_day, bxt['cartType']
order by day_of_week, hour_of_day
This query returns session counts per cart type for each day-hour combination. You can pivot the result in your BI tool to calculate conversion rates like:
conversion rate = COUNT(CONVERTED) / COUNT(ALL_CART_TYPES)
Querying event types
To analyze user interactions on your website or platform, you can extract session-level event types such as viewing content, searching, adding to cart, and completing a purchase.
select s.meta_ref
,s.meta_created_at_date
,e.type
,e.core_channel
,e.core_point_of_sale
from sessions s
join events e ON s.meta_ref=e.meta_session_ref
where e.meta_created_at_date >= current_date - interval '8' day
and e.meta_created_at_date <= current_date - interval '1' day
and e.type IN ('VIEW','SEARCH','PAYMENT','CLEAR_CART','CONFIRM','CHECKOUT')This query does the following:
-
Joins the
eventsandsessionstables usingmeta_refandmeta_session_refto provide session context for user actions. -
Filters for six common event types.
-
Uses
meta_created_at_dateto retrieve data from the past 8 days.
After extracting the data, use your BI tool to group by event_type and count distinct meta_session_ref to measure how frequently each event type occurs.
Or use this SQL query:
select
e.type as event_type,
count(distinct e.meta_session_ref) as session_count
from events e
where e.meta_created_at_date >= current_date - interval '8' day
and e.meta_created_at_date <= current_date - interval '1' day
and e.type in ('VIEW', 'SEARCH', 'PAYMENT', 'CLEAR_CART', 'CONFIRM', 'CHECKOUT')
group by e.type
order by session_count descThis query returns the number of unique sessions associated with each event type, helping you quantify user engagement at different stages.
Querying the experience definitions table
To analyze flow configurations and performance, you can join the experience_definitions dimension table with the events table. This reveals which variant users saw, whether it was a control or test group, and the split ratio.
This query assumes you're working with the TRACKING event type, which is used in experiences and experiments.
select core_friendly_id
,TRANSFORM(cast(json_parse(json_format(cast(core_variants as json))) as ARRAY<JSON>),x->JSON_EXTRACT_SCALAR(x,'$[1]')) as variant_name
,TRANSFORM(cast(json_parse(json_format(cast(core_traffic.splits as json))) as ARRAY<JSON>),x->JSON_EXTRACT_SCALAR(x,'$[3]')) as split
,e.flow_execution_flow_ref
,e.flow_execution_flow_variant_ref
,TRANSFORM(cast(json_parse(json_format(cast(core_variants as json))) as ARRAY<JSON>),x->JSON_EXTRACT_SCALAR(x,'$[2]')) as is_Control
,e.meta_session_ref
from events e
join experience_definitions ed
ON e.flow_execution_flow_ref=ed.meta_ref
where meta_created_at_date >= current_date - interval '8' day
and meta_created_at_date <= current_date - interval '1' dayThis query does the following:
-
Retrieves
flow_ref,flow_variant_ref, andmeta_session_ reffrom theeventstable. -
Retrieves additional information, such as the
friendly_idof theflow_ref, the variant name, and whether the A/B test variant is set as control or test, from theexperience_definitionstable. -
Joins the
eventstable with theexperience_definitionstable usingflow_execution_flow_reffrom theeventstable andmeta_reffrom theexperience_definitionstable. -
Retrieves traffic split details between control and test groups during an A/B test from the
core_trafficfield, and determines the start date and when the test was last run using thecore_schedulefield. -
Filters results based on the
flow_execution_created_attimestamp field.
Depending on your BI tool or programming language, you can extract data from the core_variant, core_schedule, and core_traffic columns in various ways. Note that core_variant column is an array, while core_schedule, and core_traffic columns are structured (or struct) types.
Querying UTM parameters
If your website uses UTM (Urchin Tracking Module) parameters for campaign tracking, you can extract them from session data to analyze traffic sources and campaign performance.
select bxt['utmCampaign'] as utmCampaign from sessions
where bxt['utmCampaign'] is not null
limit 100This query does the following:
-
Retrieves the
utmCampaignvalue from thebxtfield. -
Filters results to include only rows where
utmCampaignis not null. -
Limits the result to the 100 most recent UTM campaigns.
UTM parameters like utmSource, utmMedium, utmCampaign, utmTerm, and utmContent are typically stored as key-value pairs in the bxt field.
You can use similar queries to extract specific UTM parameters, such as source, medium, term, or content, providing insights into how different campaigns and traffic sources contribute to user sessions.