Skip to main content
Users
CloudPortalLogin
  • Powered byPowered by
Developing with Sitecore CDP
Data privacy
Before you start sending data
Integrating with Sitecore CDP
Stream API
Batch API
REST APIs
Data lake export service
  • Sitecore CDP for developers
  • Data lake export service
  • Writing SQL queries

Writing SQL queries

If you're familiar with SQL, use this topic to help you write SQL queries on data exported from your organization's data lake as described in Accessing your data. You can query data to build insights into your organization's site performance.

These queries draw from six tables:

  • guests

  • sessions

  • events

  • orders

  • order items

  • experience definitions

You can see the relationship between these tables in the entity relationship diagram. The data includes facts (numerical data) and dimensions (non-numerical data).

The following SQL queries use the same data that's rendered in the Summary Report.

Querying guest types

To identify the proportion of guest types—specifically VISITOR and CUSTOMER—who are visiting your organization’s website, start by extracting the relevant guest-session records:

RequestResponse
select type as guest_type, 
      s.core_channel, 
      g.core_first_seen_at, 
      g.meta_ref 
from guests g
left join sessions s on s.meta_guest_ref = g.meta_ref
where g.type in ('VISITOR','CUSTOMER')
  and s.meta_created_at_date >= current_date - interval '8' day
  and s.meta_created_at_date <= current_date - interval '1' day

This query does the following:

  • Retrieves all guest-session combinations from the past 8 days (excluding today).

  • Filters to include only VISITOR and CUSTOMER guest types.

  • Joins session-level data (like core_channel) with guest records using the meta_ref identifier.

  • Filters by meta_created_at_date to take advantage of table partitioning for performance.

After extracting the data, you can calculate the number of unique guests per type and their proportions using your Business Intelligence (BI) tool (Power BI, Tableau, or Looker) by grouping by guest_type and counting distinct meta_ref.

Or use this SQL query:

RequestResponse
select 
    g.type as guest_type,
    count(distinct g.meta_ref) as guest_count,
    round(
        count(distinct g.meta_ref) * 100.0 / 
        sum(count(distinct g.meta_ref)) over (), 
        2
    ) as guest_percentage
from guests g
left join sessions s 
    on s.meta_guest_ref = g.meta_ref
where g.type in ('VISITOR','CUSTOMER')
  and s.meta_created_at_date >= current_date - interval '8' day
  and s.meta_created_at_date <= current_date - interval '1' day
group by g.type

This query returns the number and percentage of unique guests per type.

Note

In the Summary Report, guest types are visualized as a donut chart, making it easy to see the distribution between visitors and customers over the selected date range.

Querying new visitors and new customers

To monitor new traffic to your website, you can identify new visitors and new customers based on when they were first seen. Start by extracting all VISITOR and CUSTOMER records whose first appearance on the site occurred within the past 8 full days:

RequestResponse
select   g.type, 
         s.core_channel, 
         g.core_first_seen_at,
         g.meta_ref 
from guests g
left join sessions s on s.meta_guest_ref = g.meta_ref
where g.type in ('VISITOR','CUSTOMER')
  and g.core_first_seen_at >= current_date - interval '8' day
  and g.core_first_seen_at <= current_date - interval '1' day
  and s.meta_created_at_date >= current_date - interval '8' day
  and s.meta_created_at_date <= current_date - interval '1' day

This query does the following:

  • Uses a structure similar to the guest types query.

  • Filters by core_first_seen_at to identify guests who are new within the specified timeframe.

  • Filters by s.meta_created_at_date to ensure sessions fall within the same period.

  • Joins session data to allow breakdowns by channel or timestamp.

After extracting the data, calculate the number of unique new guests by type using your BI tool by grouping by guest_type and counting distinct meta_ref.

Or use this SQL query:

RequestResponse
select 
    g.type as guest_type,
    s.core_channel,
    g.core_first_seen_at,
    count(distinct g.meta_ref) as new_guest_count
from guests g
left join sessions s 
    on s.meta_guest_ref = g.meta_ref
where g.type in ('VISITOR', 'CUSTOMER')
  and g.core_first_seen_at >= current_date - interval '8' day
  and g.core_first_seen_at <= current_date - interval '1' day
  and s.meta_created_at_date >= current_date - interval '8' day
  and s.meta_created_at_date <= current_date - interval '1' day
group by g.type, s.core_channel, g.core_first_seen_at

This query groups by guest type, channel, and first seen date. It also counts distinct guests to avoid double-counting across multiple sessions.

Note

In the Summary Report, the number of new visitors and customers is displayed as a trend over the past eight days.

Querying order items

You can run this query to capture the total value and volume of orders by product category.

RequestResponse
select distinct
       oi.meta_created_at,
       oi.type,
       oi.bxt['channel'] as core_channel,
       oi.core_price,
       oi.core_quantity
from order_items oi
join orders o on o.meta_ref = oi.meta_order_ref
where oi.meta_created_at >= current_date - interval '8' day
  and oi.meta_created_at <= current_date - interval '1' day
  and oi.core_status in ('PURCHASED','CONFIRMED')

This query does the following:

  • Selects order items created in the last 7 days with a status of PURCHASED or CONFIRMED.

  • Joins the orders table with the order_items table via the order reference (meta_ref and meta_order_ref) to enrich the dataset if needed.

  • Extracts core_channel from the bxt JSON-like field.

  • Retrieves the unit price and quantity for each item.

  • Uses distinct to avoid duplicate rows.

Note

In the Summary Report:

  • Order items value is calculated by multiplying core_price x core_quantity and aggregating.

  • Order item quantity is calculated using SUM(core_quantity)

Querying inbound and outbound channels

To analyze traffic patterns, you can track inbound and outbound channels across recent guest sessions:

RequestResponse
select meta_created_at_date,
       core_channel,
       core_point_of_sale,
       meta_ref
from sessions
where meta_created_at_date >= current_date - interval '8' day
  and meta_created_at_date <= current_date - interval '1' day
  and core_channel is not null

This query does the following:

  • Filters sessions by the partitioned column meta_created_at_date for efficient execution.

  • Excludes sessions where the core_channel is null— these typically result from missing or incomplete data.

  • Retrieves date, channel, point of sale, and guest identifier for downstream analysis.

After extracting the data, calculate the number of sessions per channel using your BI tool by grouping by core_channel and counting meta_ref. You can also create a custom field to label each core_channel as either inbound or outbound based on your definitions.

Or use this SQL query:

RequestResponse
select 
    core_channel,
    count(meta_ref) as session_count,
    round(
        count(meta_ref) * 100.0 / 
        sum(count(meta_ref)) over (), 
        2
    ) as channel_percentage
from sessions
where meta_created_at_date >= current_date - interval '8' day
  and meta_created_at_date <= current_date - interval '1' day
  and core_channel is not null
group by core_channel

This query returns the total session count and percentage per channel.

Note

In the Summary Report, inbound and outbound channels are displayed as separate trend graphs.

Querying cart types and proportions

To understand user behavior during shopping sessions, you can analyze cart types and measure their proportions across all sessions.

RequestResponse
select meta_created_at_date,
       bxt['cartType'] as cart_type,
       core_point_of_sale,
       meta_ref
from sessions
where meta_created_at_date >= current_date - interval '8' day
  and meta_created_at_date <= current_date - interval '1' day

This query does the following:

  • Retrieves the cartType from the bxt field in the sessions table.

  • Filters by meta_created_at_date, a partitioned column, for efficient performance.

  • Returns all sessions within the past 8 days.

The cart_type can be one of:

  • BROWSED - no cart interaction

  • ABANDONED - item added to cart but not purchased

  • CONVERTED - purchase made

After extracting the data, calculate the number and proportion of sessions for each cart type using your BI tool by grouping by cart_type and counting meta_ref.

Or use this SQL query:

RequestResponse
select 
    bxt['cartType'] as cart_type,
    count(meta_ref) as session_count,
    round(
        count(meta_ref) * 100.0 / 
        sum(count(meta_ref)) over (), 
        2
    ) as cart_type_percentage
from sessions
where meta_created_at_date >= current_date - interval '8' day
  and meta_created_at_date <= current_date - interval '1' day
group by bxt['cartType']

This query returns the number of sessions per cart type and their percentage of total sessions.

Note

In the Summary Report, session types and cart proportion are displayed as separate graphs.

Querying conversions by day and hour

To understand when users are most likely to convert, you can analyze session-level conversions by day of the week and hour of day. This query builds on the data retrieved from the cart type query.

RequestResponse
select meta_created_at_date,
       day_of_week(meta_created_at_date) as day_of_week,
       core_channel,
       core_point_of_sale,
       extract(hour from meta_created_at) as hour_of_day,
       bxt['cartType'] as cartType,
       meta_ref
from sessions
where meta_created_at_date >= current_date - interval '8' day
  and meta_created_at_date <= current_date - interval '1' day
  and bxt['cartType'] is not null

This query does the following:

  • Extracts the day_of_week and hour_of_day from session timestamps.

  • Filters to include only sessions with a cartType— indicating some form of engagement (browsing, abandoning, or converting).

  • Returns additional metadata like channel and point of sale.

After extracting the data, calculate session volume per cart type using your BI tool by grouping by day_of_week, hour_of_day, and cart_type, and counting meta_ref.

Or use this SQL query:

RequestResponse
select 
    day_of_week(meta_created_at_date) as day_of_week,
    extract(hour from meta_created_at) as hour_of_day,
    bxt['cartType'] as cart_type,
    count(meta_ref) as session_count
from sessions
where meta_created_at_date >= current_date - interval '8' day
  and meta_created_at_date <= current_date - interval '1' day
  and bxt['cartType'] is not null
group by day_of_week, hour_of_day, bxt['cartType']
order by day_of_week, hour_of_day

This query returns session counts per cart type for each day-hour combination. You can pivot the result in your BI tool to calculate conversion rates like:

conversion rate = COUNT(CONVERTED) / COUNT(ALL_CART_TYPES)

Querying event types

To analyze user interactions on your website or platform, you can extract session-level event types such as viewing content, searching, adding to cart, and completing a purchase.

RequestResponse
select s.meta_ref
             ,s.meta_created_at_date
             ,e.type
             ,e.core_channel
             ,e.core_point_of_sale
      from sessions s
      join events e ON s.meta_ref=e.meta_session_ref
      where e.meta_created_at_date >= current_date - interval '8' day
      and e.meta_created_at_date <= current_date - interval '1' day
      and e.type IN ('VIEW','SEARCH','PAYMENT','CLEAR_CART','CONFIRM','CHECKOUT')

This query does the following:

  • Joins the events and sessions tables using meta_ref and meta_session_ref to provide session context for user actions.

  • Filters for six common event types.

  • Uses meta_created_at_date to retrieve data from the past 8 days.

After extracting the data, use your BI tool to group by event_type and count distinct meta_session_ref to measure how frequently each event type occurs.

Or use this SQL query:

RequestResponse
select 
    e.type as event_type,
    count(distinct e.meta_session_ref) as session_count
from events e
where e.meta_created_at_date >= current_date - interval '8' day
  and e.meta_created_at_date <= current_date - interval '1' day
  and e.type in ('VIEW', 'SEARCH', 'PAYMENT', 'CLEAR_CART', 'CONFIRM', 'CHECKOUT')
group by e.type
order by session_count desc

This query returns the number of unique sessions associated with each event type, helping you quantify user engagement at different stages.

Querying the experience definitions table

To analyze flow configurations and performance, you can join the experience_definitions dimension table with the events table. This reveals which variant users saw, whether it was a control or test group, and the split ratio.

This query assumes you're working with the TRACKING event type, which is used in experiences and experiments.

RequestResponse
select core_friendly_id
             ,TRANSFORM(cast(json_parse(json_format(cast(core_variants as json))) as ARRAY<JSON>),x->JSON_EXTRACT_SCALAR(x,'$[1]')) as variant_name
             ,TRANSFORM(cast(json_parse(json_format(cast(core_traffic.splits as json))) as ARRAY<JSON>),x->JSON_EXTRACT_SCALAR(x,'$[3]')) as split
             ,e.flow_execution_flow_ref
             ,e.flow_execution_flow_variant_ref
             ,TRANSFORM(cast(json_parse(json_format(cast(core_variants as json))) as ARRAY<JSON>),x->JSON_EXTRACT_SCALAR(x,'$[2]')) as is_Control
			 ,e.meta_session_ref
	  from events e
	  join experience_definitions ed
	  ON e.flow_execution_flow_ref=ed.meta_ref
	  where meta_created_at_date >= current_date - interval '8' day
      and meta_created_at_date <= current_date - interval '1' day

This query does the following:

  • Retrieves flow_ref, flow_variant_ref, and meta_session_ ref from the events table.

  • Retrieves additional information, such as the friendly_id of the flow_ref, the variant name, and whether the A/B test variant is set as control or test, from the experience_definitions table.

  • Joins the events table with the experience_definitions table using flow_execution_flow_ref from the events table and meta_ref from the experience_definitions table.

  • Retrieves traffic split details between control and test groups during an A/B test from the core_traffic field, and determines the start date and when the test was last run using the core_schedule field.

  • Filters results based on the flow_execution_created_at timestamp field.

Depending on your BI tool or programming language, you can extract data from the core_variant, core_schedule, and core_traffic columns in various ways. Note that core_variant column is an array, while core_schedule, and core_traffic columns are structured (or struct) types.

Querying UTM parameters

If your website uses UTM (Urchin Tracking Module) parameters for campaign tracking, you can extract them from session data to analyze traffic sources and campaign performance.

RequestResponse
select bxt['utmCampaign'] as utmCampaign from sessions 
where bxt['utmCampaign'] is not null 
limit 100

This query does the following:

  • Retrieves the utmCampaign value from the bxt field.

  • Filters results to include only rows where utmCampaign is not null.

  • Limits the result to the 100 most recent UTM campaigns.

Note

UTM parameters like utmSource, utmMedium, utmCampaign, utmTerm, and utmContent are typically stored as key-value pairs in the bxt field.

You can use similar queries to extract specific UTM parameters, such as source, medium, term, or content, providing insights into how different campaigns and traffic sources contribute to user sessions.

Do you have some feedback for us?

If you have suggestions for improving this article,

Privacy policySitecore Trust CenterCopyright © 1999-2026 Sitecore