Troubleshoot Redis timeouts

Current version: 10.2

Redis timeouts (RedisTimeoutException) can be caused by conditions on either the server or the client. This topic describes solutions for some of the most common timeout scenarios. For more information, you can also refer to GitHub and the Microsoft Azure documentation.

Server- and client-side issues

Learn how to solve issues that occur because of an Azure Cache for Redis condition, or the virtual machine(s) hosting it.

Possible solutions for infrastructure-related causes:

Cause

Solution

Bandwidth/size limits

Monitor Redis Server metrics to check whether you have reached bandwidth/size limits. Possible solutions:

  • Upgrade the pricing tier of the Redis server.

  • Make sure compression="true" is set for the Redis session state provider. Compressing session state data reduces the amount of data that you transfer between the Sitecore instance and the Redis database.

    Note

    This solution might cause additional CPU overhead.

High client CPU/memory/bandwidth usage

High CPU, memory, or bandwidth usage on the client can prevent a request from being processed within the operationTimeoutInMilliseconds interval, which causes the request to time out. Possible solutions:

  • Investigate what is consuming CPU, memory, and/or bandwidth.

  • Upgrade your client to a larger VM or App size with more CPU/memory/bandwidth.

  • Make sure compression="true" is set for the Redis session state provider. Compressing session state data reduces the amount of data that you transfer between the Sitecore instance and the Redis database.

    Note

    This solution might cause additional CPU overhead.

Server and client application are not in the same region in Azure

We recommend that you have the cache and the client in the same Azure region. For example, timeouts might occur when your cache is in East US but the client is in West US and the request does not complete within the operationTimeoutInMilliseconds interval.

Application issues

High load can cause RedisTimeoutException to occur due to a combination of reasons. Try to adjust your application setup/load to solve the problem.

Possible solutions for application-related causes:

Cause

Solution

Thread limits

If the RedisTimeoutException message contains Busy values for WORKER and IOCP that are greater than the corresponding Min values in the message, you can change the parameters for ThreadPoolSizeMonitor, which monitors the ThreadPool size and adjusts the Min value to the load:

Important

Do not make changes directly to the configuration files. Instead, you must create a patch file that performs the required changes during runtime.

  • Patch parameters under: sitecore/pipelines/initialize/processor[type="Sitecore.Analytics.Pipelines.Loader.StartThreadPoolSizeMonitor, Sitecore.Analytics"].

    Important

    If you set the values too high, it might slow down your application.
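To check the Busy and Min values described above without reading the message by hand, you could parse them out of the logged exception text. The following is a minimal sketch, assuming the message contains IOCP and WORKER groups in the form (Busy=...,Free=...,Min=...,Max=...). The sample message and the function names are illustrative assumptions, and the exact message text can differ between StackExchange.Redis versions.

```python
import re

# Illustrative message only; the real text varies by client version.
SAMPLE_MESSAGE = (
    "Timeout performing GET key, inst: 1, queue: 2, qu: 1, qs: 1, qc: 0, "
    "wr: 0/1, in: 0/0, IOCP: (Busy=6,Free=994,Min=4,Max=1000), "
    "WORKER: (Busy=7,Free=8184,Min=4,Max=8191)"
)

# Matches one thread pool group, for example "WORKER: (Busy=7,Free=...,Min=4,Max=...)".
POOL_PATTERN = re.compile(
    r"(IOCP|WORKER):\s*\(Busy=(\d+),Free=(\d+),Min=(\d+),Max=(\d+)\)"
)


def thread_pool_stats(message):
    """Return the parsed counters per pool, keyed by 'IOCP' and 'WORKER'."""
    stats = {}
    for name, busy, free, minimum, maximum in POOL_PATTERN.findall(message):
        stats[name] = {
            "Busy": int(busy),
            "Free": int(free),
            "Min": int(minimum),
            "Max": int(maximum),
        }
    return stats


def busy_exceeds_min(message):
    """True when both IOCP and WORKER report Busy greater than Min."""
    stats = thread_pool_stats(message)
    return all(
        name in stats and stats[name]["Busy"] > stats[name]["Min"]
        for name in ("IOCP", "WORKER")
    )


if __name__ == "__main__":
    print(thread_pool_stats(SAMPLE_MESSAGE))
    print(busy_exceeds_min(SAMPLE_MESSAGE))
```

If busy_exceeds_min returns True for messages collected under load, the thread limits cause described in this row is a likely candidate.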

Too many requests

The StackExchange.Redis client uses a single TCP connection and can only read one response at a time. Even when an operation times out, the data continues to be sent to or from the server, which blocks other requests and causes them to time out. In the RedisTimeoutException message, the qs parameter indicates how many requests have been sent but have not yet received a response.

Possible solution:

  • On the Redis session state provider, use a pool of ConnectionMultiplexer objects in your client by adjusting the connectionPoolSize setting. This setting controls the number of ConnectionMultiplexers created for communication with Redis. The default value is 1.

    To find a suitable starting value for your environment, you could try dividing the number of machine CPU cores by two. For example, use connectionPoolSize="8" for 16-core machines.
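The cores-divided-by-two rule of thumb can be written as a small helper. This is a sketch of that starting-point calculation only, not a tuned recommendation. The function name is hypothetical, and the lower bound of 1 reflects the default value mentioned above.

```python
import os


def suggested_connection_pool_size(cpu_cores=None):
    """Return half the CPU core count, but never less than the default of 1."""
    if cpu_cores is None:
        # os.cpu_count() can return None when the count cannot be determined.
        cpu_cores = os.cpu_count() or 1
    return max(1, cpu_cores // 2)


if __name__ == "__main__":
    size = suggested_connection_pool_size(16)
    print(f'connectionPoolSize="{size}"')
```

Treat the result as a first value to test under load, then adjust it based on the timeouts you observe.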

Low values for operationTimeoutInMilliseconds or retryTimeoutInMilliseconds

The Sitecore Redis session state provider uses configuration settings named operationTimeoutInMilliseconds and retryTimeoutInMilliseconds for operations. If a call does not complete in the time specified, the Redis client throws a timeout error.

Possible solutions:

  • Increase the values of the operationTimeoutInMilliseconds and/or retryTimeoutInMilliseconds settings.

  • Make sure the retryTimeoutInMilliseconds value is higher than the operationTimeoutInMilliseconds value. Otherwise, no retries occur.
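The relationship between the two settings can be checked before you deploy a configuration change. The following is a minimal sketch of that check. The function name and the example values are hypothetical and are not recommended settings.

```python
def timeout_problems(operation_timeout_ms, retry_timeout_ms):
    """Return a list of problems found in the two timeout values."""
    problems = []
    if operation_timeout_ms <= 0:
        problems.append("operationTimeoutInMilliseconds must be positive")
    if retry_timeout_ms <= operation_timeout_ms:
        # Per the guidance above, no retries occur in this case.
        problems.append(
            "retryTimeoutInMilliseconds must be higher than "
            "operationTimeoutInMilliseconds; otherwise, no retries occur"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(timeout_problems(operation_timeout_ms=5000, retry_timeout_ms=3000))
    print(timeout_problems(operation_timeout_ms=5000, retry_timeout_ms=16000))
```

An empty list means the pair of values allows retries to occur.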
