Default workers

Projection worker


The projection worker (ProjectionWorker) is a distributed worker that projects data into a tabular structure and stores the results in the Cortex Processing Engine Storage database (Storage database). Default uses of the projection worker include projecting contact and interaction data into formats that can be used to train a machine learning model.

To use the projection worker, register a projection worker task that includes a projection worker options dictionary and a data source:

// Syntax example only
Guid taskId = await taskManager.RegisterDistributedTaskAsync(
    new ContactDataSourceOptionsDictionary(new ContactExpandOptions(), 5, 10),
    new ContactProjectionWorkerOptionsDictionary(
        typeof(SampleModel).AssemblyQualifiedName, // assembly-qualified name of the IModel implementation
        TimeSpan.MaxValue,
        "SampleSchemaName",
        new Dictionary<string, string> { ["TestCaseId"] = "Id" }),
    TimeSpan.FromMinutes(10));

Important

A merge worker task should always be registered in a task chain after a projection worker task.

Projection worker options dictionaries

A projection worker options dictionary defines:

An implementation of Sitecore.Processing.Engine.ML.Abstractions.IModel, which includes:
- The projection itself, which determines the structure of the projected data.

Note

An IModel is required even if you are using projection outside of a machine learning context. If you do not require the training or evaluation methods that are included in this interface, you can return empty results.
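The idea of a projection-only model can be sketched as follows. This is a minimal illustration, not the Sitecore API: ISketchModel, SketchContact, and ProjectionOnlyModel are hypothetical stand-ins defined here so that the example compiles on its own, and the real IModel members and signatures may differ. The sketch shows a model whose projection turns an entity into a row of named columns, while the training and evaluation methods return empty results.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the real IModel interface. The actual member
// names and signatures in Sitecore.Processing.Engine.ML.Abstractions may differ.
public interface ISketchModel<TEntity>
{
    // The projection: maps one entity to one row (column name -> value).
    IReadOnlyDictionary<string, object> Project(TEntity entity);

    // Training and evaluation hooks that a projection-only model leaves empty.
    IReadOnlyList<object> Train(IEnumerable<IReadOnlyDictionary<string, object>> rows);
    IReadOnlyList<object> Evaluate(IEnumerable<IReadOnlyDictionary<string, object>> rows);
}

// Hypothetical entity used only for this illustration.
public sealed record SketchContact(Guid Id, string Email, int InteractionCount);

public sealed class ProjectionOnlyModel : ISketchModel<SketchContact>
{
    // Determines the structure of the projected data: three named columns.
    public IReadOnlyDictionary<string, object> Project(SketchContact contact) =>
        new Dictionary<string, object>
        {
            ["ContactId"] = contact.Id,
            ["Email"] = contact.Email,
            ["InteractionCount"] = contact.InteractionCount
        };

    // No machine learning is needed here, so both methods return empty results.
    public IReadOnlyList<object> Train(IEnumerable<IReadOnlyDictionary<string, object>> rows) =>
        Array.Empty<object>();

    public IReadOnlyList<object> Evaluate(IEnumerable<IReadOnlyDictionary<string, object>> rows) =>
        Array.Empty<object>();
}

public static class Program
{
    public static void Main()
    {
        var model = new ProjectionOnlyModel();
        var row = model.Project(new SketchContact(Guid.NewGuid(), "a@example.com", 3));
        Console.WriteLine($"{row.Count} columns projected");
        Console.WriteLine($"{model.Train(new[] { row }).Count} training results");
    }
}
```

In a real implementation, the projection would be expressed with the types that the Processing Engine provides. The sketch only illustrates the split between a meaningful projection and empty training and evaluation results.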

There are two default options dictionaries that inherit a base ProjectionWorkerOptionsDictionary class:

- ContactProjectionWorkerOptionsDictionary
- InteractionProjectionWorkerOptionsDictionary

You can define a custom options dictionary that inherits ProjectionWorkerOptionsDictionary, for example, to project data from a database of orders.

Data source options dictionary

The projection worker is a distributed worker and requires a data source.

Using projection and merge together

Multiple projection workers can work on a single projection task in parallel. Each worker writes to its own table, and these tables must be merged into a single table when the projection task is complete. Merging is done by a separate merge worker (MergeWorker), so you will almost never register a projection task without also registering a merge task.
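The relationship between parallel projection and the final merge can be illustrated with a small, self-contained simulation. It does not use the Processing Engine API: ProjectionMergeSketch and its methods are hypothetical names, in-memory lists stand in for the per-worker tables, and squaring a number stands in for a real projection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Conceptual simulation only; none of these names exist in the Sitecore API.
public static class ProjectionMergeSketch
{
    // Each simulated worker projects its own slice of the input into its own table.
    public static async Task<List<List<(int Id, int Squared)>>> ProjectInParallelAsync(
        IReadOnlyList<int> source, int workerCount)
    {
        var workers = Enumerable.Range(0, workerCount).Select(worker => Task.Run(() =>
            source.Where((value, index) => index % workerCount == worker)
                  .Select(id => (Id: id, Squared: id * id))
                  .ToList()));

        // One result table per worker, available once every worker has finished.
        return (await Task.WhenAll(workers)).ToList();
    }

    // The merge step combines the per-worker tables into one final table.
    public static List<(int Id, int Squared)> Merge(
        IEnumerable<List<(int Id, int Squared)>> tables) =>
        tables.SelectMany(table => table).OrderBy(row => row.Id).ToList();
}

public static class Program
{
    public static async Task Main()
    {
        var source = Enumerable.Range(1, 10).ToList();
        var tables = await ProjectionMergeSketch.ProjectInParallelAsync(source, 3);
        var merged = ProjectionMergeSketch.Merge(tables);

        // Prints "3 worker tables, 10 merged rows".
        Console.WriteLine($"{tables.Count} worker tables, {merged.Count} merged rows");
    }
}
```

The merge runs only after all workers have finished, which mirrors why a merge task is registered after the projection task in a task chain.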
