Doc version recovery
If you never ran Server Pro version 5.0.1 or Community Edition version 5.0.1, or you started a brand new instance with 5.0.1, you do not need to run this recovery process.
Updates to this page:
(2024-04-22 13:40 BST): Added step "Stop new updates from getting into the system and flush all changes to MongoDB".
(2024-04-23 11:45 BST): Account for broken flushes in 5.0.1 and skip flushes when 5.0.2 was started.
The duration of the recovery will depend on the number and size of projects in your instance and the storage backend used by the history store for chunks (as defined in OVERLEAF_HISTORY_CHUNKS_BUCKET).
The recovery process will delay the application start inside the Server Pro container, so the site will appear offline during that time. We only support running the recovery from a single instance of the Server Pro container; all other horizontal scaling workers need to be offline.
You can stop and resume the recovery process if needed.
Based on our performance tests, the recovery process handles approximately 10k small projects per minute on modern hardware (3GHz CPU clock speed and local NVMe storage). For example, for an instance with 100k projects, schedule a maintenance window that allows at least 10+2 minutes (12 minutes in total) of downtime. Use the following query to estimate the number of projects in your instance:
Please read the following recovery steps in full before you start. Server Pro customers are more than welcome to reach out to support@overleaf.com with any questions.
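As a rough planning aid, the downtime arithmetic from the example above can be sketched in a few lines of Python. This is only a sketch: the throughput of 10k projects per minute and the 2 extra minutes are taken from the text, and treating the 2 minutes as a fixed allowance is an assumption. The function name estimate_downtime_minutes is hypothetical. Real throughput depends on project size, hardware and the storage backend.

```python
import math

# Figures taken from the text above: roughly 10k small projects per minute.
PROJECTS_PER_MINUTE = 10_000
# Assumption: the "+2" in the 100k-project example is a fixed allowance.
EXTRA_MINUTES = 2


def estimate_downtime_minutes(project_count: int,
                              projects_per_minute: int = PROJECTS_PER_MINUTE,
                              extra_minutes: int = EXTRA_MINUTES) -> int:
    """Return a lower bound for the maintenance window, in whole minutes."""
    if project_count < 0:
        raise ValueError("project_count must not be negative")
    if projects_per_minute <= 0:
        raise ValueError("projects_per_minute must be positive")
    # Round the processing time up to the next full minute.
    processing_minutes = math.ceil(project_count / projects_per_minute)
    return processing_minutes + extra_minutes


if __name__ == "__main__":
    # The 100k-project example from the text: 10 + 2 minutes.
    print(estimate_downtime_minutes(100_000))
```

Treat the result as a minimum. Instances with large projects or a slower storage backend should plan for a longer window.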
Recovery process
Stop new updates and flush all changes to MongoDB
Stop new updates from getting into the system and flush all changes to MongoDB:
Close the editor and disconnect all users manually via the admin panel on https://my-server-pro.example.com/admin#open-close-editor in the "Open/Close Editor" tab.
Stop the Websocket/real-time service.
Wait for the real-time service to exit, as indicated by "down:".
Stop the git-bridge container if enabled.
If you never ran 5.0.2: Issue a manual flush for document updates and wait for it to finish with success.
You can repeat the command on error. If you see a non-zero failureCount in successive runs, please stop the migration (restore the services via docker restart git-bridge sharelatex) and reach out to support.
If you never ran 5.0.2: Ensure that all changes have been flushed out of redis.
If you get any output from redis-cli, please stop the migration (restore the services via docker restart git-bridge sharelatex) and reach out to support.
Try to flush any pending history changes.
This will need to be a best-effort flush, as some projects have broken histories due to the bad database migration. Any failures will be addressed with a re-sync of the history at the end of the recovery process.
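The retry rule for the manual document flush can be written down as a small decision helper. This is a sketch under assumptions: it takes the failureCount values you read from each run yourself (it does not parse any command output), and it interprets "successive runs" as the two most recent runs both reporting a non-zero value. The function name should_stop_migration is hypothetical.

```python
from typing import Sequence


def should_stop_migration(failure_counts: Sequence[int]) -> bool:
    """Decide whether to stop and contact support.

    failure_counts holds the failureCount value noted after each flush
    run, oldest first. Assumption: "successive runs" means the two most
    recent runs both reported a non-zero failureCount.
    """
    if len(failure_counts) < 2:
        # A single failing run may simply be repeated.
        return False
    return failure_counts[-1] != 0 and failure_counts[-2] != 0


if __name__ == "__main__":
    print(should_stop_migration([3]))     # one failing run: repeat the command
    print(should_stop_migration([3, 0]))  # the retry succeeded: continue
    print(should_stop_migration([3, 2]))  # still failing: stop and reach out
```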
Take a backup
Consider taking a consistent backup of the instance.
Validate the recovery process
Validate the recovery process by opening the history pane for a few of the projects with previously missing history.
Expedite the resync for the projects to test. (They will get processed eventually, but we do not want to wait for them to get their turn.)
(Repeat with each of the project-ids to test; replace 000000000000000000000000 with one project-id at a time.)
Open the project editor for each of the projects at https://my-server-pro.example.com/project/000000000000000000000000
Open the "History" pane for the project and see the latest content.
Optional: Close the "History" pane again. Make a code change, such as adding a comment to the header.
Optional: Issue a re-compile to trigger a flush of the local change. Open the "History" pane again and see the change. When done, undo the change.
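When validating several projects, it can help to generate the editor URLs up front. The following Python sketch substitutes each project-id into the URL pattern shown above. The host is the placeholder from this page. The check for 24 hexadecimal characters is an assumption based on the shape of the placeholder id, and the function name project_editor_urls is hypothetical.

```python
import re

# Placeholder host from this page; replace it with your own site URL.
BASE_URL = "https://my-server-pro.example.com"
# Assumption: project-ids are 24 hexadecimal characters, like the placeholder.
PROJECT_ID_RE = re.compile(r"[0-9a-fA-F]{24}")


def project_editor_urls(project_ids, base_url=BASE_URL):
    """Return the project editor URL for each project-id, in input order."""
    urls = []
    for project_id in project_ids:
        project_id = project_id.strip()
        if not PROJECT_ID_RE.fullmatch(project_id):
            raise ValueError(f"unexpected project-id format: {project_id!r}")
        urls.append(f"{base_url.rstrip('/')}/project/{project_id}")
    return urls


if __name__ == "__main__":
    for url in project_editor_urls(["000000000000000000000000"]):
        print(url)
```

Rejecting malformed ids early avoids opening a URL for a project that does not exist because of a copy-and-paste error.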
Keep the instance running
Please keep the instance that executed the recovery process running. It will resync the history for all projects in the background with a concurrency of 1. This will result in a slightly elevated base load. (You can restart the instance, but it will need to start over with the resyncs.)