Horizontal scaling

Starting with version 3.5.6, Server Pro supports horizontal scaling.

This document lists the technical requirements and provides guidelines for running Server Pro on more than one node.

Setting up horizontal scaling requires a significant amount of effort. We advise considering horizontal scaling only when reaching a certain scale. As an example, a Server Pro installation for 1,000 total users has been set up successfully using a single server provisioned with two 4-core processors and 32GB of system memory. See the hardware requirements documentation for recommendations.

A deployment of Server Pro with horizontal scaling involves a set of external components, such as a Load Balancer and an S3-compatible storage backend.

We can help troubleshoot errors in the Server Pro containers that might be the result of misconfiguration and provide general advice based on this document. Unfortunately, we are unable to provide assistance configuring third-party applications/systems.

Resolving technical issues specific to the hardware or software you use to provide the external components is not covered by our support terms.

Requirements

External, central data storage

The data storage in Server Pro can be split into four data stores:

  • MongoDB

    • Most of the data is persisted into MongoDB.

    • We support either a local instance or an external instance, such as MongoDB Atlas (a fully managed MongoDB service that runs within the AWS infrastructure).

    Note: Unfortunately, there is currently no official support for MongoDB-compatible databases such as Cosmos DB/DocumentDB, as we have not tested Server Pro with them. While deploying Server Pro with compatible databases may be possible, we only officially support deployments using MongoDB.

  • Redis

    • Redis stores temporary data, such as pending document updates before they are flushed to MongoDB.

    • Redis is used for communicating document updates between different services and notifying the editor about state changes in a given project.

    • Redis is used for storing the user sessions.

    • We support either a local instance or an external instance.

    Note: Unfortunately, there is currently no official support for Redis-compatible key/value stores such as KeyDB/Valkey, as we have not tested Server Pro with them. While deploying Server Pro with compatible stores may be possible, we only officially support deployments using Redis.

  • Project files and History files

    • Non-editable project files are stored outside of MongoDB.

      The new project history system (Server Pro 3.5 onwards) stores the history outside MongoDB as well.

    • For small single instances, we support either a local file system (which could be backed by a local SSD, NFS or EBS) or an S3-compatible data storage system.

    • For horizontal scaling, we only support S3-compatible data storage systems.

    Important: NFS/Amazon EFS/Amazon EBS are not supported for horizontal scaling. Please see the hardware storage requirements section on scaling storage in Server Pro for more details.

  • Ephemeral files

    • LaTeX compiles need to run on fast, local disks for optimal performance. The output of the compilation does not need to be persisted or backed up.

    • Buffering of new file uploads and the creation of project zip files also benefits from using a local disk.

Git-bridge

Git-bridge is available in Server Pro starting with version 4.0.1.

The git repositories are stored locally on disk. There are no replication options available. Git-bridge should be run as a singleton. For optimal performance, we advise using a local disk for git-bridge data. The git-bridge data disk should be backed up regularly.

For the data storage with horizontal scaling, you need:

  • a central MongoDB instance that is accessible from all Server Pro instances

  • a central Redis instance that is accessible from all Server Pro instances

  • a central S3 compatible storage backend for project and history files

  • a local disk on each instance for ephemeral files

  • a local disk on the instance that hosts the git-bridge container for git-bridge data
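Before starting additional instances, it can help to confirm from each node that the central services accept TCP connections. The following is a minimal sketch using only the Python standard library. The host names (mongo.internal, redis.internal, s3.internal) are placeholders, and the ports are the MongoDB and Redis defaults plus HTTPS for the S3 endpoint, all of which may differ in your setup.

```python
import socket

# Placeholder endpoints (assumptions): replace with your deployment's addresses.
CENTRAL_SERVICES = {
    "mongodb": ("mongo.internal", 27017),  # 27017 is the MongoDB default port
    "redis": ("redis.internal", 6379),     # 6379 is the Redis default port
    "s3": ("s3.internal", 443),            # assumes an HTTPS S3 endpoint
}


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name resolution.
        return False


def check_services(services: dict) -> dict:
    """Map each service name to the result of its reachability check."""
    return {
        name: is_reachable(host, port)
        for name, (host, port) in services.items()
    }
```

Calling check_services(CENTRAL_SERVICES) on every node returns a dictionary such as {"mongodb": True, ...}. A successful TCP connection only shows that the port is open; it does not verify credentials or that the service behind it is healthy.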

Load balancer requirements

  • Persistent routing, e.g. using a cookie

    This requirement stems from these components:

    • The real-time editing capability in Server Pro uses WebSockets with a fallback to XHR polling. Each editing session has local state on the server side and the requests of a given editing session always need to be routed to the same Server Pro instance. The collaboration feature uses Redis Pub/Sub for sharing updates between multiple Server Pro instances.

    • The LaTeX compilation keeps the output and compile cache locally for optimal performance. After a compile request is issued to one Server Pro instance, the subsequent PDF/log download requests need to be routed to the same Server Pro instance.

  • Long request timeouts to support the compilation of large LaTeX documents

  • WebSocket support for optimal performance

  • POST payload size of 50MB

  • Keep-alive timeout must be lower than the Server Pro keep-alive timeout

    The keep-alive timeout in Server Pro can be configured using the environment variable NGINX_KEEPALIVE_TIMEOUT. The default value is 65s.

    With the default, a keep-alive timeout of 60s in the load balancer works.

    With NGINX_KEEPALIVE_TIMEOUT=120, the load balancer could pick 115s.

  • Client IPs

    Set the request header X-Forwarded-For to the client IP.

  • When terminating SSL

    The load balancer needs to add the request header X-Forwarded-Proto: https.
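The keep-alive rule in the list above can be expressed as a small check. This is a sketch only: the 5-second margin is an assumption chosen to reproduce the two examples given (65s gives 60s, 120 gives 115s), and accepting a trailing "s" when parsing the value is also an assumption about how the variable may be written.

```python
DEFAULT_NGINX_KEEPALIVE_TIMEOUT = 65  # seconds, the default stated above


def parse_seconds(value: str) -> int:
    """Parse values such as "65", "65s" or " 120 " into whole seconds."""
    return int(value.strip().rstrip("s"))


def nginx_timeout_from_env(env: dict) -> int:
    """Read NGINX_KEEPALIVE_TIMEOUT from a mapping, falling back to the default."""
    raw = env.get("NGINX_KEEPALIVE_TIMEOUT")
    return parse_seconds(raw) if raw else DEFAULT_NGINX_KEEPALIVE_TIMEOUT


def suggest_lb_keepalive(nginx_timeout: int, margin: int = 5) -> int:
    """Suggest a load balancer keep-alive timeout below the Server Pro one."""
    if margin < 1:
        raise ValueError("margin must be at least 1 second")
    if nginx_timeout <= margin:
        raise ValueError("Server Pro timeout is too small for this margin")
    return nginx_timeout - margin


def lb_keepalive_is_valid(lb_timeout: int, nginx_timeout: int) -> bool:
    """The load balancer timeout must be lower than the Server Pro timeout."""
    return 0 < lb_timeout < nginx_timeout
```

In practice, nginx_timeout_from_env(os.environ) could be run in the environment used for the Server Pro container and the result compared with the value configured in the load balancer.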

Sample HAProxy configuration

Server Pro configuration

Secrets

The Server Pro instances need to agree on shared secrets:

  • WEB_API_PASSWORD (web api auth)

  • STAGING_PASSWORD and V1_HISTORY_PASSWORD, both set to the same value (history auth)

  • CRYPTO_RANDOM (for session cookie)

  • OT_JWT_AUTH_KEY (history auth)

All of these secrets need to be configured with their own unique value and shared between the instances.

When these secrets are not configured and user requests get routed to different Server Pro instances, the requests will fail authentication checks: users either get redirected to the login page frequently, or their actions in the UI fail in unexpected ways.

When not configured, Server Pro uses a new random value for each secret based on 32 random bytes from /dev/urandom (256 random bits).
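One way to produce the shared values once and then distribute them to all instances is sketched below. It mirrors the 32 random bytes mentioned above by using Python's secrets module; the hex encoding and the KEY=value output format are assumptions made for illustration, not a format prescribed by Server Pro.

```python
import secrets

# Secrets that need their own unique value.
UNIQUE_SECRETS = [
    "WEB_API_PASSWORD",
    "STAGING_PASSWORD",
    "CRYPTO_RANDOM",
    "OT_JWT_AUTH_KEY",
]


def generate_shared_secrets() -> dict:
    """Generate one value per secret from 32 random bytes (64 hex characters)."""
    values = {name: secrets.token_hex(32) for name in UNIQUE_SECRETS}
    # V1_HISTORY_PASSWORD must carry the same value as STAGING_PASSWORD.
    values["V1_HISTORY_PASSWORD"] = values["STAGING_PASSWORD"]
    return values


def to_env_lines(values: dict) -> list:
    """Render the secrets as KEY=value lines, e.g. for an env file."""
    return [f"{name}={value}" for name, value in sorted(values.items())]
```

The generated lines would be produced a single time and applied to every Server Pro instance; generating them separately per instance would recreate the mismatch described above.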

MongoDB

Point OVERLEAF_MONGO_URL (SHARELATEX_MONGO_URL for versions 4.x and earlier) at the central MongoDB instance.

Redis

Point OVERLEAF_REDIS_HOST (SHARELATEX_REDIS_HOST for versions 4.x and earlier) and REDIS_HOST at the central Redis instance.

S3 compatible storage for project and history files

Please see the documentation on S3 compatible storage for details.

Ephemeral files

The default bind-mount of a local SSD to /var/lib/overleaf (/var/lib/sharelatex for versions 4.x and earlier) will be sufficient. Be sure to point SANDBOXED_COMPILES_HOST_DIR at the mount point on the host.

Proxy configuration

  • Set OVERLEAF_BEHIND_PROXY=true (SHARELATEX_BEHIND_PROXY for versions 4.x and earlier) for accurate client IPs.

  • Set TRUSTED_PROXY_IPS to the IP of the load balancer (multiple CIDRs can be specified, separated by commas).
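To sanity-check a TRUSTED_PROXY_IPS value before deploying it, the comma-separated list can be parsed with Python's ipaddress module and tested against the load balancer's address. This is a standalone sketch; it does not reproduce how Server Pro itself interprets the variable, and the addresses in the usage note are examples only.

```python
import ipaddress


def parse_trusted_proxies(value: str) -> list:
    """Parse a comma-separated list of IPs/CIDRs into network objects."""
    return [
        # strict=False accepts entries with host bits set, e.g. 10.0.0.5/24.
        ipaddress.ip_network(part.strip(), strict=False)
        for part in value.split(",")
        if part.strip()
    ]


def is_trusted(ip: str, networks: list) -> bool:
    """Return True if the address falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)
```

For example, with the value "10.0.0.5, 192.168.0.0/24", is_trusted("192.168.0.17", ...) is true while is_trusted("203.0.113.9", ...) is false. A malformed entry raises ValueError during parsing, which surfaces typos early.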

Git-bridge integration

Git-bridge is available in Server Pro starting with version 4.0.1.

The git-bridge container needs a sibling Server Pro container for handling incoming git requests. This sibling container can serve regular user traffic as well. In the sample configuration, the first instance acts as the sibling container for git-bridge, but any instance could take on this role.

Why do we need to designate one Server Pro container as sibling for git-bridge? Server Pro hands out download URLs for the history service to git-bridge. We need to configure these history URLs to be accessible from the git-bridge container.

Server Pro container config:

  • Set GIT_BRIDGE_ENABLED to 'true'

  • Set GIT_BRIDGE_HOST to <git-bridge container name> e.g. git-bridge

  • Set GIT_BRIDGE_PORT to 8000

  • Set V1_HISTORY_URL to http://<server-pro sibling container name>:3100/api.

    Note: This is only necessary on the sibling container for the git-bridge container. The other instances can use a localhost URL, which is the default.

git-bridge container config:

  • Set GIT_BRIDGE_API_BASE_URL to http://<server-pro sibling container name>/api/v0, e.g. http://server-pro-ha-1/api/v0

  • Set GIT_BRIDGE_OAUTH2_SERVER to http://<server-pro sibling container name>, e.g. http://server-pro-ha-1

  • Set GIT_BRIDGE_POSTBACK_BASE_URL to http://<git-bridge container name>:8000, e.g. http://git-bridge:8000

  • Set GIT_BRIDGE_ROOT_DIR to the bind-mounted git-bridge data disk, e.g. /data/git-bridge
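The two lists above can be derived from just the container names, which helps keep the URLs consistent. The sketch below builds both sets of variables as dictionaries; the function names are illustrative, and the container names used in the usage note are the examples from this section.

```python
def sibling_server_pro_env(git_bridge: str, sibling: str, port: int = 8000) -> dict:
    """Environment for the Server Pro container that acts as git-bridge sibling."""
    return {
        "GIT_BRIDGE_ENABLED": "true",
        "GIT_BRIDGE_HOST": git_bridge,
        "GIT_BRIDGE_PORT": str(port),
        # Only needed on the sibling; other instances keep the localhost default.
        "V1_HISTORY_URL": f"http://{sibling}:3100/api",
    }


def git_bridge_env(git_bridge: str, sibling: str, root_dir: str, port: int = 8000) -> dict:
    """Environment for the git-bridge container itself."""
    return {
        "GIT_BRIDGE_API_BASE_URL": f"http://{sibling}/api/v0",
        "GIT_BRIDGE_OAUTH2_SERVER": f"http://{sibling}",
        "GIT_BRIDGE_POSTBACK_BASE_URL": f"http://{git_bridge}:{port}",
        "GIT_BRIDGE_ROOT_DIR": root_dir,
    }
```

With git_bridge="git-bridge", sibling="server-pro-ha-1" and root_dir="/data/git-bridge", the result matches the example values listed above. The non-sibling Server Pro instances would still set the GIT_BRIDGE_* variables but leave V1_HISTORY_URL at its default.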

Sample docker-compose.yml configuration

The following configuration shows a self-contained setup. For the demo to work, you need to provide a valid SSL key/certificate and adjust the OVERLEAF_SITE_URL (SHARELATEX_SITE_URL for versions 4.x and earlier). For an actual setup, you must replace the dummy secrets with actual secrets as noted inline, move the individual containers onto dedicated nodes, and adjust the IP addresses to your local network setup.

Hardware

We recommend using the same hardware specifications for all the Server Pro instances that are taking part in horizontal scaling.

The general recommendations on hardware specifications for Server Pro instances apply.

Upgrading Server Pro

As part of the upgrade process, Server Pro automatically runs database migrations. These migrations are not designed to be run from multiple instances in parallel.

The migrations need to finish before the actual web application is started. You can either check the logs for a "Finished migrations" entry or wait until the application accepts traffic.

The upgrade procedure looks like this:

  1. Schedule a maintenance window

  2. Stop all the instances of Server Pro

  3. Take a consistent backup as described in the documentation

  4. Start a single instance of Server Pro with the new version

  5. Validate that the new instance is working as expected

  6. Bring up the other instances with the new version
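Between steps 4 and 6, the "Finished migrations" log entry can be checked programmatically before the remaining instances are started. The sketch below assumes that the migration output is visible through `docker logs` for the container and that a plain substring match is sufficient; the container name passed in is whatever your deployment uses.

```python
import subprocess
from typing import Iterable

MIGRATIONS_DONE_MARKER = "Finished migrations"  # log entry named in this document


def migrations_finished(log_lines: Iterable[str]) -> bool:
    """Return True once any log line contains the completion marker."""
    return any(MIGRATIONS_DONE_MARKER in line for line in log_lines)


def container_log_lines(container: str) -> list:
    """Fetch a container's log output via `docker logs` as a list of lines."""
    result = subprocess.run(
        ["docker", "logs", container],
        capture_output=True,
        text=True,
        check=True,
    )
    # `docker logs` replays the container's stdout and stderr separately.
    return (result.stdout + result.stderr).splitlines()
```

A deployment script could poll migrations_finished(container_log_lines(name)) for the first upgraded instance and only proceed to the other instances once it returns True and the instance has been validated.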
