Snapshots and Dumps
Snapshot vs Dump
MeiliSearch has two ways to back up its data: snapshots and dumps.
Snapshots make it possible to schedule the creation of hard copies of your database. This feature is intended mainly as a safeguard: ensuring that if some failure occurs, you’re able to relaunch your database quickly and efficiently from a snapshot. The documents in a snapshot are already “indexed” and ready to go, greatly increasing import speed. However, as a result, snapshots are not compatible between different versions of MeiliSearch.
Dumps, on the other hand, export MeiliSearch data in a way that is not bound to a specific MeiliSearch version. As a result, importing a dump requires MeiliSearch to index all of your documents. This process requires a certain amount of time and memory (corresponding to the number of documents, their size, and the complexity of any index settings).
To summarize, snapshots are highly efficient but not portable between different versions of MeiliSearch. Dumps, on the other hand, are highly portable but not very efficient, as frequently launching MeiliSearch from a dump would cause your performance to suffer.
Snapshots
A snapshot is an exact copy of the database (i.e. the data.ms folder) at the time the snapshot was created. Besides compression, snapshots do not go through any processing. They can be thought of as “pre-compiled copies”.
Using this feature, it is possible to schedule snapshot creation at custom intervals and use existing snapshots to restore MeiliSearch.
Creating Snapshots
For MeiliSearch to create snapshots, the feature must be enabled by adding the following flag:
$ meilisearch --schedule-snapshot=true
By default, MeiliSearch creates snapshots in a directory called snapshots/ at the root of your MeiliSearch directory.
The destination can be modified with the --snapshot-dir flag.
$ meilisearch --schedule-snapshot=true --snapshot-dir mySnapShots/
Snapshots are now created in the mySnapShots/ directory.
The first snapshot is created on launching MeiliSearch. After that, snapshots are created routinely on a set interval until you deactivate snapshots or end the MeiliSearch instance. By default, one snapshot is taken every 24 hours.
The amount of time between each new snapshot can be modified with the --snapshot-interval-sec flag.
$ meilisearch --schedule-snapshot=true --snapshot-interval-sec 3600
After running the above code, a snapshot is created every hour (3600 seconds).
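The three snapshot flags above can be combined in a single launch command. The following is a minimal sketch, not part of MeiliSearch itself: `snapshot_command` is a hypothetical helper that assembles the argument list and converts an interval given in hours into the seconds that --snapshot-interval-sec expects.

```python
def snapshot_command(snapshot_dir=None, interval_hours=None):
    """Build the argument list for launching MeiliSearch with scheduled snapshots.

    Hypothetical helper: only the three flags described above are used.
    """
    args = ["meilisearch", "--schedule-snapshot=true"]
    if snapshot_dir is not None:
        args += ["--snapshot-dir", snapshot_dir]
    if interval_hours is not None:
        # The flag expects seconds, so convert from hours.
        args += ["--snapshot-interval-sec", str(round(interval_hours * 3600))]
    return args


# Print the command line for hourly snapshots stored in mySnapShots/.
print(" ".join(snapshot_command("mySnapShots/", interval_hours=1)))
```

The returned list can be passed directly to `subprocess.run` if you launch MeiliSearch from a script.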
During snapshot creation, old snapshots are automatically overwritten. This means that only the most recent snapshot should be present in the folder at any given time.
[More about snapshots flags and env variables]
Start from Snapshot
Because snapshots are exact copies of your database that haven’t gone through any processing besides compression, starting a MeiliSearch instance from a snapshot is significantly faster than adding documents manually or starting from a dump.
When you set the environment variable MEILI_IMPORT_SNAPSHOT or pass the CLI flag --import-snapshot, MeiliSearch starts the server from the provided snapshot.
$ meilisearch --import-snapshot mySnapShots/data.ms.snapshot
Common Problems
Whenever you launch MeiliSearch from a snapshot, it will stop processing and throw an error if it encounters either of the following two situations:
- A database already exists (i.e. you have a non-empty data.ms folder in the same directory as your MeiliSearch binary)
- No snapshot is found at the given path
In both cases, this behavior is configurable.
If you don’t want MeiliSearch to throw an error when finding that a database already exists, you can add the following flag: --ignore-snapshot-if-db-exists=true. When using this flag, MeiliSearch will use the existing database to start an instance instead of throwing an error. The snapshot will be ignored.
If you do not want MeiliSearch to throw an error when there is no snapshot at the given path, you can add the following flag: --ignore-missing-snapshot. MeiliSearch will then continue its process and not import any snapshot.
When starting from a snapshot, chances are that you already have an existing database. For security reasons, a database is never overwritten. To load a snapshot when an existing database is present, you will have to manually delete the existing database. By default, this is the contents of the data.ms folder (unless you changed the path), which is located in the same folder as your MeiliSearch binary.
The simplest way to delete your database is with the terminal command rm -rf data.ms, after which you should be able to start MeiliSearch with a snapshot.
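The rules in this section can be summarized as a small decision function. This is only a sketch of the documented behavior, not MeiliSearch code: `snapshot_start_outcome` is a hypothetical name, and checking for an existing database before checking for the snapshot is an assumption, since the order is not stated here.

```python
def snapshot_start_outcome(db_exists, snapshot_found,
                           ignore_if_db_exists=False, ignore_missing=False):
    """Describe what happens when launching with --import-snapshot.

    Assumption: the existing-database check runs before the snapshot check.
    """
    if db_exists:
        if ignore_if_db_exists:
            # --ignore-snapshot-if-db-exists=true: keep the database, skip the snapshot.
            return "start from existing database, snapshot ignored"
        return "error: a database already exists"
    if snapshot_found:
        return "import snapshot"
    if ignore_missing:
        # --ignore-missing-snapshot: start normally without importing anything.
        return "start without importing a snapshot"
    return "error: no snapshot found at the given path"


# A leftover data.ms folder and no extra flags: the launch fails.
print(snapshot_start_outcome(db_exists=True, snapshot_found=True))
```

In the failing case shown, either delete the data.ms folder first or pass --ignore-snapshot-if-db-exists=true, depending on whether you want the snapshot or the existing database to win.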
[More about snapshots flags and env variables]
Use Cases
Snapshots are safeguards in case of problems. If your MeiliSearch instance encounters a problem or if you make a mistake while manipulating your database, restarting your instance with the latest snapshot is an easy way to recover your data.
Version Compatibility
Since a snapshot is an exact replica of your database, it can only be opened by the same version of MeiliSearch that created it.
Dumps
A dump is a compressed file containing an export of your MeiliSearch instance. It contains all your indexes, documents, and settings, but in a raw unprocessed form. A dump isn’t an exact copy of your database—it is closer to a blueprint that allows you to create an identical dataset. A dump can be imported when launching MeiliSearch, but be advised that it may take some time to index all the documents within.
Creating a Dump
To create a dump of your dataset, you need to use the appropriate HTTP route: POST /dumps. Using that route will trigger a dump creation process. Creating a dump is an asynchronous task that takes time based on the size of your dataset. A dump uid (unique identifier) is returned to help you track the process.
$ curl -X POST 'http://localhost:7700/dumps'
client.createDump()
client.create_dump()
$client->createDump();
client.create_dump
The above code triggers a dump creation process.
At any given moment, you can check the status of a particular dump creation process using the previously received dump uid, like so: GET /dumps/:dump_uid/status. Using this route, you can check whether your dump is still processing, has already been created, or has encountered a problem.
$ curl -X GET 'http://localhost:7700/dumps/20201101-110357260/status'
client.getDumpStatus("20201101-110357260")
client.get_dump_status('20201101-110357260')
$client->getDumpStatus('20201101-110357260');
client.get_dump_status('20201006-053243949')
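Because dump creation is asynchronous, a script usually has to poll the status route until the dump is finished. The sketch below uses only the Python standard library rather than an official SDK. It assumes that the route returns a JSON object with a `status` field and that the value `"in_progress"` means the dump is still processing; neither detail is specified on this page, so adjust them to match what your instance returns. The function names are hypothetical.

```python
import json
import time
import urllib.request


def http_fetch_status(uid, base_url="http://localhost:7700"):
    """Call GET /dumps/:dump_uid/status and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/dumps/{uid}/status") as response:
        return json.load(response)


def wait_for_dump(uid, fetch_status=http_fetch_status, poll_seconds=1.0,
                  max_polls=60, pending="in_progress"):
    """Poll the status route until the dump is no longer pending.

    Assumption: the response has a "status" field and `pending` is the value
    reported while the dump is still being created.
    """
    for _ in range(max_polls):
        body = fetch_status(uid)
        if body.get("status") != pending:
            return body  # finished or failed; the caller inspects the status
        time.sleep(poll_seconds)
    raise TimeoutError(f"dump {uid} still pending after {max_polls} polls")
```

With a running instance, `wait_for_dump("20201101-110357260")` would block until the status changes. The `fetch_status` parameter exists so that the polling loop can be exercised without a server.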
After your dump creation process is done, the dump file is created and added to the dump folder. By default, this folder is /dumps at the root of your MeiliSearch binary, but this can be customized. Note that if your dump folder does not already exist when the dump creation process is called, MeiliSearch will create it.
Import a Dump
Once you have exported a dump, which is a .dump file, you can use that dump to launch MeiliSearch. As the data contained in the dump needs to be indexed, the process will take some time to complete. Only when the dump has been fully imported will the MeiliSearch server start, after which you can begin searching through your data.
./meilisearch --import-dump /dumps/20200813-042312213.dump
Because importing a dump is the same process as when documents are initially indexed by MeiliSearch, it can require some time and memory. If your dataset is very large, it is good practice to index documents in larger batches. This will speed up the indexing process at the cost of memory.
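When several dumps accumulate in the dump folder, a launch script may want to import the most recent one. The sketch below is one way to pick it. It assumes that dump files are named after their uid and that uids are timestamp-like (as in 20200813-042312213.dump), so that sorting the names alphabetically also sorts them chronologically. Both function names are hypothetical.

```python
import os


def latest_dump_name(filenames):
    """Return the newest .dump file name, or None if there is none.

    Assumption: names are timestamp-like uids, so the last name in
    alphabetical order is also the most recent dump.
    """
    dumps = sorted(name for name in filenames if name.endswith(".dump"))
    return dumps[-1] if dumps else None


def import_dump_command(dump_dir="dumps"):
    """Build the launch command that imports the newest dump in dump_dir."""
    newest = latest_dump_name(os.listdir(dump_dir))
    if newest is None:
        raise FileNotFoundError(f"no .dump file in {dump_dir}")
    return ["./meilisearch", "--import-dump", os.path.join(dump_dir, newest)]
```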
See here for more dumps options
Use Cases
Dumps are used to restore your database after updating MeiliSearch, or to transfer your database to other instances of MeiliSearch (e.g. running on different servers) without having to worry about their respective versions.