Writing SQL queries
If you're familiar with SQL, use this topic to help you write SQL queries on data exported from your organization's data lake as described in Accessing your data. You can query data to build insights into your organization's site performance.
These queries draw from six tables:
You can see the relationship between these tables in the entity relationship diagram. The data includes facts (numerical data) and dimensions (non-numerical data).
The following SQL queries use the same data that's rendered in the Summary Report.
Querying guest types
To identify the proportion of guest types—specifically VISITOR and CUSTOMER—who are visiting your organization’s website, start by extracting the relevant guest-session records:
This query does the following:
-
Retrieves all guest-session combinations from the past 8 days (excluding today).
-
Filters to include only
VISITORandCUSTOMERguest types. -
Joins session-level data (like
core_channel) with guest records using themeta_refidentifier. -
Filters by
meta_created_at_dateto take advantage of table partitioning for performance.
After extracting the data, you can calculate the number of unique guests per type and their proportions using your Business Intelligence (BI) tool (Power BI, Tableau, or Looker) by grouping by guest_type and counting distinct meta_ref.
Or use this SQL query:
This query returns the number and percentage of unique guests per type.
In the Summary Report, guest types are visualized as a donut chart, making it easy to see the distribution between visitors and customers over the selected date range.
Querying new visitors and new customers
To monitor new traffic to your website, you can identify new visitors and new customers based on when they were first seen. Start by extracting all VISITOR and CUSTOMER records whose first appearance on the site occurred within the past 8 full days:
This query does the following:
-
Uses a structure similar to the guest types query.
-
Filters by
core_first_seen_atto identify guests who are new within the specified timeframe. -
Filters by
s.meta_created_at_dateto ensure sessions fall within the same period. -
Joins session data to allow breakdowns by channel or timestamp.
After extracting the data, calculate the number of unique new guests by type using your BI tool by grouping by guest_type and counting distinct meta_ref.
Or use this SQL query:
This query groups by guest type, channel, and first seen date. It also counts distinct guests to avoid double-counting across multiple sessions.
In the Summary Report, the number of new visitors and customers is displayed as a trend over the past eight days.
Querying order items
You can run this query to capture the total value and volume of orders by product category.
This query does the following:
-
Selects order items created in the last 7 days with a status of
PURCHASEDorCONFIRMED. -
Joins the
orderstable with theorder_itemstable via the order reference (meta_refandmeta_order_ref) to enrich the dataset if needed. -
Extracts
core_channelfrom thebxtJSON-like field. -
Retrieves the unit price and quantity for each item.
-
Uses
distinctto avoid duplicate rows.
In the Summary Report:
-
Order items value is calculated by multiplying
core_pricexcore_quantityand aggregating. -
Order item quantity is calculated using
SUM(core_quantity)
Querying inbound and outbound channels
To analyze traffic patterns, you can track inbound and outbound channels across recent guest sessions:
This query does the following:
-
Filters sessions by the partitioned column
meta_created_at_datefor efficient execution. -
Excludes sessions where the
core_channelis null— these typically result from missing or incomplete data. -
Retrieves date, channel, point of sale, and guest identifier for downstream analysis.
After extracting the data, calculate the number of sessions per channel using your BI tool by grouping by core_channel and counting meta_ref. You can also create a custom field to label each core_channel as either inbound or outbound based on your definitions.
Or use this SQL query:
This query returns the total session count and percentage per channel.
In the Summary Report, inbound and outbound channels are displayed as separate trend graphs.
Querying cart types and proportions
To understand user behavior during shopping sessions, you can analyze cart types and measure their proportions across all sessions.
This query does the following:
-
Retrieves the
cartTypefrom thebxtfield in thesessionstable. -
Filters by
meta_created_at_date, a partitioned column, for efficient performance. -
Returns all sessions within the past 8 days.
The cart_type can be one of:
-
BROWSED- no cart interaction -
ABANDONED- item added to cart but not purchased -
CONVERTED- purchase made
After extracting the data, calculate the number and proportion of sessions for each cart type using your BI tool by grouping by cart_type and counting meta_ref.
Or use this SQL query:
This query returns the number of sessions per cart type and their percentage of total sessions.
In the Summary Report, session types and cart proportion are displayed as separate graphs.
Querying conversions by day and hour
To understand when users are most likely to convert, you can analyze session-level conversions by day of the week and hour of day. This query builds on the data retrieved from the cart type query.
This query does the following:
-
Extracts the
day_of_weekandhour_of_dayfrom session timestamps. -
Filters to include only sessions with a
cartType— indicating some form of engagement (browsing, abandoning, or converting). -
Returns additional metadata like channel and point of sale.
After extracting the data, calculate session volume per cart type using your BI tool by grouping by day_of_week, hour_of_day, and cart_type, and counting meta_ref.
Or use this SQL query:
This query returns session counts per cart type for each day-hour combination. You can pivot the result in your BI tool to calculate conversion rates like:
conversion rate = COUNT(CONVERTED) / COUNT(ALL_CART_TYPES)
Querying event types
To analyze user interactions on your website or platform, you can extract session-level event types such as viewing content, searching, adding to cart, and completing a purchase.
This query does the following:
-
Joins the
eventsandsessionstables usingmeta_refandmeta_session_refto provide session context for user actions. -
Filters for six common event types.
-
Uses
meta_created_at_dateto retrieve data from the past 8 days.
After extracting the data, use your BI tool to group by event_type and count distinct meta_session_ref to measure how frequently each event type occurs.
Or use this SQL query:
This query returns the number of unique sessions associated with each event type, helping you quantify user engagement at different stages.
Querying the experience definitions table
To analyze flow configurations and performance, you can join the experience_definitions dimension table with the events table. This reveals which variant users saw, whether it was a control or test group, and the split ratio.
This query assumes you're working with the TRACKING event type, which is used in experiences and experiments.
This query does the following:
-
Retrieves
flow_ref,flow_variant_ref, andmeta_session_ reffrom theeventstable. -
Retrieves additional information, such as the
friendly_idof theflow_ref, the variant name, and whether the A/B test variant is set as control or test, from theexperience_definitionstable. -
Joins the
eventstable with theexperience_definitionstable usingflow_execution_flow_reffrom theeventstable andmeta_reffrom theexperience_definitionstable. -
Retrieves traffic split details between control and test groups during an A/B test from the
core_trafficfield, and determines the start date and when the test was last run using thecore_schedulefield. -
Filters results based on the
flow_execution_created_attimestamp field.
Depending on your BI tool or programming language, you can extract data from the core_variant, core_schedule, and core_traffic columns in various ways. Note that core_variant column is an array, while core_schedule, and core_traffic columns are structured (or struct) types.
Querying UTM parameters
If your website uses UTM (Urchin Tracking Module) parameters for campaign tracking, you can extract them from session data to analyze traffic sources and campaign performance.
This query does the following:
-
Retrieves the
utmCampaignvalue from thebxtfield. -
Filters results to include only rows where
utmCampaignis not null. -
Limits the result to the 100 most recent UTM campaigns.
UTM parameters like utmSource, utmMedium, utmCampaign, utmTerm, and utmContent are typically stored as key-value pairs in the bxt field.
You can use similar queries to extract specific UTM parameters, such as source, medium, term, or content, providing insights into how different campaigns and traffic sources contribute to user sessions.