Observability
For peak performance, Sitecore Content Hub uses an observability strategy consisting of three main pillars as illustrated in the following diagram:

| Pillar | Description | Tools used |
|---|---|---|
| Watch | To evaluate performance, we collect data using industry-leading tools. We constantly monitor the infrastructure, operating system, and application metrics. In addition to technology metrics, we integrate business metrics into the data, focusing on the customer experience and expected performance. | Elastic Stack, Prometheus, Site 24x7 |
| Learn | The collected data is aggregated to create meaningful dashboards. These dashboards provide a critical visual aid for the immediate identification of issues. Long-term aggregation of data makes it possible to predict or identify trends that could lead to failure, and creates opportunities to act before customer environments are impacted. | TIG (Telegraf, InfluxDB, Grafana) |
| Act | When issues or potential impacts are identified, our suite of alerting tools works in unison to collect the relevant information and generate alerts targeted to the specific person, team, or automation agent required to facilitate rapid response and recovery. | OpsGenie |

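The trend prediction described under the Learn pillar can be illustrated with a small sketch. It is not how Content Hub implements forecasting; it only shows one simple approach (a least-squares line through evenly spaced samples) under assumed inputs. The function name `hours_until_threshold`, the hourly sampling interval, and the sample values are all hypothetical.

```python
from typing import Optional, Sequence


def hours_until_threshold(
    samples: Sequence[float], threshold: float, interval_hours: float = 1.0
) -> Optional[float]:
    """Fit a least-squares line to evenly spaced samples and estimate how many
    hours after the last sample the fitted line reaches `threshold`.

    Returns None when the fitted trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    # Ordinary least-squares slope over sample indices 0..n-1.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    slope = numerator / denominator
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Sample index at which the fitted line crosses the threshold.
    crossing = (threshold - intercept) / slope
    return max(0.0, (crossing - (n - 1)) * interval_hours)


# Hypothetical disk usage (%) sampled hourly, with an assumed 90% alert level.
remaining = hours_until_threshold([50, 52, 54, 56], threshold=90)
```

With a steadily rising series like the one above, the result gives an operator a rough lead time before the threshold is reached; real metrics are noisier and would usually need a longer window or a more robust model.
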
Observability and alerting tools
Observability and alerting tools enable Content Hub to have constant insight and receive continuous feedback from our systems through monitoring and logs.
Content Hub uses the following tools:
| Tool | Description |
|---|---|
| Elastic Stack | An open-source logging platform used to consume and deliver detailed logging information in a unified format for easy ingestion and aggregation. |
| Prometheus | Fits both machine-centric monitoring and monitoring of highly dynamic service-oriented architectures. On top of providing multi-dimensional data collection and powerful querying, Prometheus can monitor Kubernetes environments, which makes it a must-have for Content Hub. |
| Site 24x7 | Provides a global perspective of website performance from more than 100 locations worldwide, checking that public-facing websites and APIs that access back-end services are up, performing well, and returning the expected data. |

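The three conditions an external availability check looks at (the endpoint is up, it performs well, and it returns the expected data) can be sketched as a small evaluation function. This is an illustrative sketch only and does not use or represent the Site 24x7 API; `ProbeResult`, `evaluate_probe`, and the 2000 ms latency budget are hypothetical names and values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProbeResult:
    """Outcome of one request made by a monitoring location (hypothetical)."""
    status_code: int
    elapsed_ms: float
    body: str


def evaluate_probe(
    result: ProbeResult, expected_text: str, max_elapsed_ms: float = 2000.0
) -> List[str]:
    """Return the reasons a probe failed; an empty list means it passed."""
    failures = []
    # Up: the endpoint answered with a 2xx status.
    if not 200 <= result.status_code < 300:
        failures.append(f"unexpected status {result.status_code}")
    # Performing well: the response arrived within the latency budget.
    if result.elapsed_ms > max_elapsed_ms:
        failures.append(f"slow response: {result.elapsed_ms:.0f} ms")
    # Returning the expected data: the body contains a known marker.
    if expected_text not in result.body:
        failures.append("expected content missing")
    return failures


healthy = evaluate_probe(ProbeResult(200, 180.0, '{"status": "ok"}'), '"status": "ok"')
degraded = evaluate_probe(ProbeResult(200, 3500.0, "maintenance"), '"status": "ok"')
```

Keeping the evaluation separate from the HTTP request makes the same rules reusable for results gathered from many locations.
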
Metrics tools
Machine and service metrics tools provide a standard, versatile approach to monitoring that can accommodate almost any business monitoring need.
Content Hub uses the TIG (Telegraf, InfluxDB, Grafana) tool stack:
| Tool | Description |
|---|---|
| Telegraf | Active agent used to collect metrics. |
| InfluxDB | Time-series database used to store metrics collected by Telegraf. |
| Grafana | Metric analytics and visualization suite that provides the ability to visualize time series data for infrastructure and application analysis. Aggregates and visualizes data from InfluxDB. Enables the creation of pre-defined alerting rules. |

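To make the hand-off between a collection agent and the time-series database more concrete, the sketch below formats a single metric in InfluxDB line protocol, the text format InfluxDB accepts for writes. It is a simplified illustration, not Content Hub's configuration: the helper `to_line_protocol`, the measurement name, and the tag and field values are hypothetical, and only the common escaping rules are handled.

```python
from typing import Mapping, Union

FieldValue = Union[bool, int, float, str]


def _escape(value: str, chars: str) -> str:
    """Backslash-escape each character in `chars` within `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value: FieldValue) -> str:
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line_protocol(
    measurement: str,
    tags: Mapping[str, str],
    fields: Mapping[str, FieldValue],
    timestamp_ns: int,
) -> str:
    """Build one line: measurement,tag=value field=value timestamp."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    # Tags are sorted by key, which InfluxDB recommends for write performance.
    for key in sorted(tags):
        head += f",{_escape(key, ',= ')}={_escape(tags[key], ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={_format_field(val)}" for key, val in fields.items()
    )
    return f"{head} {body} {timestamp_ns}"


line = to_line_protocol(
    "cpu",
    {"region": "eu west", "host": "web-01"},
    {"usage": 0.64, "cores": 8},
    1700000000000000000,
)
```

Tags (such as the hypothetical `host` and `region`) are indexed and suit values used for filtering and grouping in dashboards, while fields hold the measured values themselves.
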
Incident management tool
Content Hub uses OpsGenie as an incident management platform.
OpsGenie:
- checks that critical incidents are never missed and that the right people take appropriate actions in the shortest possible time.
- categorizes the alerts received from monitoring systems and custom applications based on importance and timing.
- provides on-call schedules to notify the appropriate people through multiple communication channels (voice calls, email, SMS, and push messages) with automatic escalation procedures.
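
The combination of alert categorization and automatic escalation listed above can be sketched in a few lines. This is a conceptual illustration only, not the OpsGenie API or Content Hub's actual policies: the priority labels, recipients, delays, and the names `Alert`, `EscalationStep`, `POLICIES`, and `steps_due` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Alert:
    source: str
    message: str
    priority: str  # assumed scheme: "P1" is most urgent


@dataclass(frozen=True)
class EscalationStep:
    delay_minutes: int          # minutes without acknowledgement before this step
    recipient: str
    channels: Tuple[str, ...]   # e.g. voice, email, sms, push


# Hypothetical escalation policies keyed by alert priority.
POLICIES: Dict[str, List[EscalationStep]] = {
    "P1": [
        EscalationStep(0, "on-call engineer", ("push", "sms", "voice")),
        EscalationStep(10, "secondary on-call", ("sms", "voice")),
        EscalationStep(30, "team lead", ("voice",)),
    ],
    "P3": [
        EscalationStep(0, "on-call engineer", ("email", "push")),
    ],
}


def steps_due(alert: Alert, minutes_unacknowledged: int) -> List[EscalationStep]:
    """Return every escalation step whose delay has elapsed for this alert."""
    # Unknown priorities fall back to the low-urgency policy (an assumption).
    policy = POLICIES.get(alert.priority, POLICIES["P3"])
    return [step for step in policy if step.delay_minutes <= minutes_unacknowledged]


outage = Alert("prometheus", "API error rate above threshold", "P1")
notified = [step.recipient for step in steps_due(outage, minutes_unacknowledged=12)]
```

In this sketch an unacknowledged high-priority alert reaches additional people over more intrusive channels as time passes, while a lower-priority alert stays with the first responder.
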