Data and backups
Sometimes we need to change the schema of the data in the database as Overleaf evolves; migration scripts are used to automate this process. They will have been run on overleaf.com first, which is the largest Overleaf instance in the world, so most eventualities will already have been encountered. However, we make no guarantees about your data. Please ensure that you create a consistent backup of your data before upgrading your instance.
Data storage
Overleaf Community Edition and Server Pro store their data in three separate places:
MongoDB Database: This is where user and project data reside.
Redis: serves as a high-performance cache for in-flight data, primarily storing information related to project editions and collaboration.
Overleaf Filesystem: stores non-editable project files (including images) and also acts as a temporary disk cache during project compilations.
For project files and full project history data we also support S3 compatible storage backends.
See Folders in detail for more information on the folder layout on disk.
Performing a consistent backup
There are three stores which need to be included when taking a consistent backup:
MongoDB
Redis
Overleaf Filesystem data
To produce a consistent backup, it is mandatory to stop users from producing new data while the backup process is running. We therefore advise scheduling a maintenance window during which users cannot access the instance or edit their projects.
Before you start the backup process you will need to take your instance offline. Starting with Server Pro 3.5.0, the shutdown process automates the closing of the site and the disconnection of users.
To shut down your instance, run bin/docker-compose stop sharelatex if you are running a Toolkit deployment, or docker compose stop sharelatex if you are running Docker Compose.
Once the sharelatex container has been stopped you can start the backup process.
Once the backup process has completed successfully, start the sharelatex container again: run bin/docker-compose start sharelatex if you are running a Toolkit deployment, or docker compose start sharelatex if you are running Docker Compose.
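Putting the steps above together, a maintenance window for a Toolkit deployment might look like the following sketch. It only frames the stop/backup/start sequence; the actual MongoDB, Redis, and filesystem backup commands are covered in the sections below, and the backup directory name is an assumption:

```shell
#!/bin/sh
# Sketch of a maintenance-window backup for a Toolkit deployment.
# Run from the toolkit directory; BACKUP_DIR is illustrative.
set -e
BACKUP_DIR="backup-$(date +%Y-%m-%d)"
mkdir -p "$BACKUP_DIR"

# Take the site offline (on Server Pro >= 3.5.0 this also disconnects users).
if [ -x bin/docker-compose ]; then
  bin/docker-compose stop sharelatex
fi

# ... back up MongoDB, Redis and the Overleaf filesystem into $BACKUP_DIR ...

# Bring the site back online once the backup has completed.
if [ -x bin/docker-compose ]; then
  bin/docker-compose start sharelatex
fi
```

The `[ -x bin/docker-compose ]` guards are only there so the sketch degrades gracefully outside a toolkit checkout; in a real deployment you would run the stop/start commands unconditionally.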
Backups should be stored on a separate server from the one your Overleaf instance is running on, ideally in a different location entirely.
Replicating databases onto multiple MongoDB instances might offer some redundancy, but it doesn't safeguard against corruption.
Testing your backups is the best way to ensure they are complete and functional.
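As a minimal first check, you can at least verify that a backup archive is readable and record a checksum so later corruption can be detected. The filename here is an assumption, and the archive-creation line exists only to make the sketch self-contained:

```shell
#!/bin/sh
# Minimal sanity check for a backup tar-ball; the filename is illustrative.
BACKUP_TAR="overleaf-backup.tar"

# For illustration only: create an empty archive if none exists yet.
[ -f "$BACKUP_TAR" ] || tar --create --file "$BACKUP_TAR" --files-from /dev/null

# A successful listing shows the archive is not truncated or unreadable.
tar --list --file "$BACKUP_TAR" > /dev/null && echo "archive readable"

# Record a checksum; verify later with `sha256sum --check`.
sha256sum "$BACKUP_TAR" > "$BACKUP_TAR.sha256"
```

A listing succeeding does not prove the backup is functional; restoring it into a scratch instance remains the only real test.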
MongoDB
MongoDB comes with a command-line tool called mongodump which can be used to create a backup of user and project data stored in the database.
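A dump from a containerised deployment might look like the sketch below. The container name `mongo` and database name `sharelatex` are assumptions; adjust them to match your docker-compose.yml:

```shell
#!/bin/sh
# Sketch of dumping the Overleaf database from a containerised MongoDB.
# "mongo" (container) and "sharelatex" (database) are assumed names.
DUMP_CMD="docker exec mongo mongodump --db sharelatex --archive"
OUT_FILE="mongodump-$(date +%Y-%m-%d).archive"

if docker inspect mongo > /dev/null 2>&1; then
  $DUMP_CMD > "$OUT_FILE"
else
  echo "mongo container not found; would run: $DUMP_CMD > $OUT_FILE"
fi
```

The `--archive` flag writes a single stream to stdout, which is convenient to redirect into a dated file; the matching restore would use `mongorestore --archive` reading the same file.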
Overleaf Filesystem data
For Toolkit deployments, the path where your non-editable files are stored is specified in config/overleaf.rc using the OVERLEAF_DATA_PATH environment variable. Depending on when your instance was created, this might be data/sharelatex.
Use a tool such as rsync to copy this directory recursively, to ensure a complete backup is created.
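A recursive copy with rsync might look like the following sketch; both paths are assumptions (point the source at your OVERLEAF_DATA_PATH), and the `mkdir` and `cp` fallback exist only to keep the sketch self-contained on machines without rsync:

```shell
#!/bin/sh
# Sketch of copying Overleaf filesystem data to a backup directory.
# SRC and DEST are illustrative; set SRC to your OVERLEAF_DATA_PATH.
SRC="data/overleaf"
DEST="backup/overleaf-data"
mkdir -p "$SRC" "$DEST"

if command -v rsync > /dev/null 2>&1; then
  # -a: archive mode (recursive, preserves permissions/timestamps/ownership)
  # --delete: make DEST an exact mirror of SRC
  rsync -a --delete "$SRC/" "$DEST/"
else
  cp -R "$SRC/." "$DEST/"   # fallback for illustration only
fi
```

Running rsync repeatedly against the same destination only transfers changed files, which keeps backup windows short.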
Redis
Redis stores user sessions and pending document updates before they are flushed to MongoDB.
Append Only File (AOF) persistence is the recommended configuration for Redis persistence.
Toolkit users have AOF persistence enabled by default for new installs, existing users can find more information regarding enabling AOF here.
If you decide to continue using RDB snapshots along with AOF persistence you can copy the RDB file to a secure location as a backup.
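A snapshot copy from a containerised Redis might look like this sketch; the container name, data path, and destination are all assumptions to adapt to your deployment:

```shell
#!/bin/sh
# Sketch of backing up Redis persistence files. "redis" (container name)
# and the paths below are assumed; adjust to your deployment.
REDIS_DATA="data/redis"
DEST="backup/redis"
mkdir -p "$DEST"

if docker inspect redis > /dev/null 2>&1; then
  # Ask Redis to write a fresh RDB snapshot before copying.
  docker exec redis redis-cli BGSAVE
fi

# Copy whichever persistence files exist. Note: Redis 7+ stores AOF data
# as multiple files under an appendonlydir/ directory instead.
for f in "$REDIS_DATA"/appendonly.aof "$REDIS_DATA"/dump.rdb; do
  [ -f "$f" ] && cp "$f" "$DEST/"
done
echo "redis backup step done"
```

BGSAVE forks Redis to write the snapshot in the background, so it is safe to trigger on a running instance; with the site already stopped for the maintenance window, there is no in-flight data to miss.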
Migrating data between servers
Ideally, the new instance does not yet contain any valuable data. We do not have a process for merging the data of two instances.
Assuming the new instance has no data yet, here are some steps you could follow. At a high level, we produce a tar-ball of the mongo, redis and overleaf volumes, copy it over to the new server, and inflate it there again.
Toolkit
# Gracefully shutdown the old instance
old-server$ bin/stop
# Create the tar-ball
old-server$ tar --create --file backup-old-server.tar config/ data/
# Copy the backup-old-server.tar file from the old-server to the
# new-server using any method that fits
# Gracefully shutdown new instance (if started yet)
new-server$ bin/stop
# Move the new instance's data aside (you could also delete it)
new-server$ mkdir backup-new-server
new-server$ mv config/ data/ backup-new-server/
# Populate config/data dir again
new-server$ tar --extract --file backup-old-server.tar
# Start containers
new-server$ bin/up
Docker Compose
# Gracefully shutdown the old instance
old-server$ docker stop sharelatex
old-server$ docker stop mongo redis
# Create the tar-ball
old-server$ tar --create --file backup-old-server.tar ~/OVERLEAF_data ~/mongo_data ~/redis_data
# Copy the backup-old-server.tar file from the old-server to
# the new-server using any method that fits
# Gracefully shutdown new instance (if started yet)
new-server$ docker stop sharelatex
new-server$ docker stop mongo redis
# Move the new instance's data aside (you could also delete it)
new-server$ mkdir backup-new-server
new-server$ mv ~/OVERLEAF_data ~/mongo_data ~/redis_data backup-new-server/
# Populate data dirs again
new-server$ tar --extract --file backup-old-server.tar
# Start containers
new-server$ docker start mongo redis
new-server$ docker start sharelatex
Depending on your docker-compose.yml file, you may need to adjust the paths of the mongo, redis and overleaf volumes.
Folders in detail
The annotations below mark how each path should be treated: (b) include in backups, (d) deprecated/unused in the latest release, (e) ephemeral data that does not need to be backed up.
~/mongo_data
(b)mongodb datadir
~/redis_data
(b)redis db datadir
~/overleaf_data
bin
synctex (d)
unused in latest release, previously a custom synctex binary was used (synctex is used for source mapping between .tex files and the pdf)
data
cache (e)
binary file cache for compiles
compiles (e)
latex compilation happens here
db.sqlite (d)
unused in latest release, previously stored clsi cache details (either moved to simple in-memory maps or we scan the disk)
db.sqlite-wal (d)
unused in latest release, see db.sqlite
output (e)
latex compilation output storage for serving to client
template_files (b)
image previews of template system (Server Pro only)
user_files (b)
binary files of projects
history (b)
full project history files
tmp
dumpFolder (e)
temporary files from handling zip files
uploads (e)
buffering of file uploads (binary file/new-project-from-zip upload)
projectHistories (e)
temporary files for full project history migrations