Data and backups
Sometimes we need to change the schema of the data in the database as we evolve Overleaf; migration scripts automate this process. They will have been run on overleaf.com first, which is the largest instance of Overleaf in the world, so most eventualities will already have been encountered. However, we make no guarantees about your data. Please ensure that you create a consistent backup of your data before upgrading your instance.
Data storage
Overleaf Community Edition and Server Pro store their data in three separate places:
MongoDB Database: This is where user and project data reside.
Redis: Serves as a high-performance cache for in-flight data, primarily storing information related to project edits and collaboration.
Overleaf Filesystem: Stores non-editable project files (including images) and also acts as a temporary disk cache during project compilations. This might be ~/sharelatex_data or ~/overleaf_data, depending on when your instance was set up.
For project files and full project history data, we also support S3-compatible storage backends.
See Folders in detail for more information on the folder layout on disk.
Performing a consistent backup
There are three stores that need to be included when taking a consistent backup:
MongoDB
Redis
Overleaf Filesystem data
To produce a consistent backup, it is mandatory to stop users from producing new data while the backup process is running. We therefore advise scheduling a maintenance window during which users cannot access the instance or edit their projects.
Before you start the backup process you will need to take your instance offline. Starting with Server Pro 3.5.0, the shutdown process automates closing the site and disconnecting users.
To shut down your instance, run bin/docker-compose stop sharelatex if you are running a Toolkit deployment, or docker compose stop sharelatex if you are running Docker Compose.
Once the sharelatex container has been stopped, you can start the backup process.
Once the backup process has completed successfully, you'll need to start the sharelatex container. To do this, run bin/docker-compose start sharelatex if you are running a Toolkit deployment, or docker compose start sharelatex if you are running Docker Compose.
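The stop, back up, start sequence above can be wrapped in a small helper. This is a minimal sketch, not an official script: the function name with_instance_stopped and the COMPOSE variable are made up for illustration, and restarting the container even when the backup fails is a design choice of this sketch (the failure is still reported through the exit status).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: run from the Toolkit checkout. For a plain Docker Compose
# deployment, set COMPOSE="docker compose" instead.
COMPOSE="${COMPOSE:-bin/docker-compose}"

# Stop the sharelatex container, run the given backup command, then start
# the container again. The backup command's exit status is returned.
with_instance_stopped() {
  local status=0
  # $COMPOSE is intentionally unquoted so "docker compose" splits into two words.
  $COMPOSE stop sharelatex
  "$@" || status=$?
  $COMPOSE start sharelatex
  return "$status"
}

# Usage (run_backup is a hypothetical command or function you provide):
#   with_instance_stopped run_backup
```

Check the exit status before treating the backup as good; a non-zero status means the backup command failed even though the instance is back online.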
Backups should be stored on a separate server from the one your Overleaf instance is running on, ideally in a different location entirely.
Replicating databases onto multiple MongoDB instances might offer some redundancy, but it doesn't safeguard against corruption.
Testing your backups is the best way to ensure they are complete and functional.
MongoDB
MongoDB comes with a command-line tool called mongodump which can be used to create a backup of user and project data stored in the database.
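As a rough illustration, mongodump can stream a compressed archive out of the database container. The sketch below assumes that MongoDB runs in a Docker container named mongo and that the instance has already been stopped; the function name backup_mongo and the target path are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: the MongoDB container is named "mongo". Adjust to your deployment.
MONGO_CONTAINER="${MONGO_CONTAINER:-mongo}"

# Write a gzip-compressed mongodump archive of all databases to the given file.
# With --archive and no file name, mongodump writes the archive to stdout.
backup_mongo() {
  local target="$1"
  docker exec "$MONGO_CONTAINER" mongodump --archive --gzip > "$target"
}

# Usage (the path is a placeholder):
#   backup_mongo "/backups/mongo-$(date +%F).archive.gz"
```

An archive produced this way can be fed back to mongorestore with the same --archive and --gzip options, which is a convenient way to test the backup on a scratch server.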
Overleaf Filesystem data
For Toolkit deployments, the path where your non-editable files are stored is specified in config/overleaf.rc using the OVERLEAF_DATA_PATH environment variable. Depending on when your instance was created, this might be data/sharelatex.
To ensure a complete backup is created, recursively copy this directory using a tool such as rsync.
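A recursive copy with rsync might look like the sketch below. The function name backup_overleaf_files and the destination are placeholders, and the instance is assumed to be stopped while the copy runs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Copy the Overleaf data directory to a backup destination.
# -a recurses and preserves permissions, timestamps and symlinks;
# --delete removes files from the destination that no longer exist in the source.
backup_overleaf_files() {
  local src="$1" dest="$2"
  # The trailing slash on the source copies its contents, not the directory itself.
  rsync -a --delete "${src%/}/" "$dest"
}

# Usage (both paths are placeholders; use your OVERLEAF_DATA_PATH as the source):
#   backup_overleaf_files data/sharelatex backup-host:/backups/overleaf
```

Drop --delete if you would rather keep files in the backup that have since been removed from the instance.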
Redis
Redis stores user sessions and pending document updates before they are flushed to MongoDB.
Append Only File (AOF) persistence is the recommended configuration for Redis persistence.
Toolkit users have AOF persistence enabled by default for new installs; existing users can find more information regarding enabling AOF here.
If you decide to continue using RDB snapshots along with AOF persistence, you can copy the RDB file to a secure location as a backup.
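The sketch below shows one way to check whether AOF persistence is active and to copy an RDB snapshot. It assumes that Redis runs in a Docker container named redis and that the snapshot uses the Redis default file name dump.rdb; both function names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: the Redis container is named "redis". Adjust to your deployment.
REDIS_CONTAINER="${REDIS_CONTAINER:-redis}"

# Succeeds if the running Redis reports appendonly = yes.
# CONFIG GET prints the setting name and then its value on separate lines.
redis_aof_enabled() {
  local value
  value="$(docker exec "$REDIS_CONTAINER" redis-cli CONFIG GET appendonly | tail -n 1)"
  [ "$value" = "yes" ]
}

# Copy the RDB snapshot from the Redis data directory to a backup directory.
# Assumption: the snapshot uses the default file name dump.rdb.
backup_redis_rdb() {
  local data_dir="$1" dest_dir="$2"
  cp "$data_dir/dump.rdb" "$dest_dir/"
}

# Usage (paths are placeholders):
#   redis_aof_enabled && echo "AOF persistence is enabled"
#   backup_redis_rdb ~/redis_data /backups/redis
```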
Migrating data between servers
Ideally, the new instance does not hold any valuable data yet. We do not have a process for merging the data of instances.
Assuming the new instance has no data yet, here are some steps you could follow. At a high level, we produce a tar-ball of the mongo, redis and overleaf volumes, copy it over to the new server, and inflate it there again.
For Docker Compose deployments, depending on your docker-compose.yml file, you may need to adjust the paths of the mongo, redis and overleaf volumes.
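The tar-ball steps described above can be sketched as two small helpers. This is only an illustration under stated assumptions: the function names are made up, the directory names in the usage comment are placeholders for wherever your mongo, redis and overleaf volumes live, and both instances are assumed to be stopped while the archive is created and extracted.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Create a tar-ball containing the given directories (run on the old server).
create_migration_archive() {
  local archive="$1"
  shift
  tar -c -f "$archive" "$@"
}

# Extract the tar-ball into the target directory (run on the new server).
extract_migration_archive() {
  local archive="$1" target="${2:-.}"
  tar -x -f "$archive" -C "$target"
}

# Usage (all paths are placeholders):
#   old-server$ create_migration_archive backup-old-server.tar data/mongo data/redis data/overleaf
#   ... copy backup-old-server.tar to the new server, for example with scp ...
#   new-server$ extract_migration_archive backup-old-server.tar .
```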
Folders in detail
The following folders have additional hints:
(b) include in backups, best when the instance is stopped to ensure consistency
(d) can be deleted
(e) ephemeral files, can be deleted when the instance is stopped
~/mongo_data (b): mongodb datadir
~/redis_data (b): redis db datadir
~/overleaf_data
    bin
        synctex (d): unused in latest release, previously a custom synctex binary was used (synctex is used for source mapping between .tex files and the pdf)
    data
        cache (e): binary file cache for compiles
        compiles (e): latex compilation happens here
        db.sqlite (d): unused in latest release, previously stored clsi cache details (either moved to simple in-memory maps or we scan the disk)
        db.sqlite-wal (d): unused in latest release, see db.sqlite
        output (e): latex compilation output storage for serving to client
        template_files (b): image previews of template system (Server Pro only)
        user_files (b): binary files of projects
        history (b): full project history files
    tmp
        dumpFolder (e): temporary files from handling zip files
        uploads (e): buffering of file uploads (binary file/new-project-from-zip upload)
        projectHistories (e): temporary files for full project history migrations