Note: this post assumes a running MySQL server, simply because that is what we use ourselves. It should work for Percona Server, too; for others (MariaDB), you'll have to try for yourself…
For a long time we ran several mysqldump exports, one for each purpose, e.g. one full backup and another partial one for the developer imports. This of course made our export runs even longer.

MySQL Shell's dump utilities changed that for us. They compress the dumped data on the fly with zstd, a very fast compression algorithm by Facebook, and they outperform the traditional tools (mysqldump, mysqlpump and mydumper) in most situations.

MySQL Shell itself can be configured via the `$HOME/.mysqlsh/options.json` file. For example, we have something like this in our shell-ready accounts:

```json
{
  "history.autoSave": "true",
  "history.maxSize": "10000",
  "defaultMode": "sql",
  "pager": "less -SFw"
}
```
Our main export script looks like this:

```bash
#!/bin/bash
# Main script for exporting logical production db backups.
# It uses MySQL Shell to make an instance dump.
current_time=$(date +"%FT%T")
backup_dir="/mnt/backups/nere-db-backup-${current_time}"

# perform a full instance backup
mysqlsh -h localhost --js <<EOF
util.dumpInstance('${backup_dir}',
                  { bytesPerChunk: '100M',
                    threads: 8,
                    compatibility: ['strip_definers'] })
EOF
```
The script calls the `util.dumpInstance()` utility to make a dump of all databases on our MySQL server instance. The tool creates a directory and exports all database objects into various files in it.

A few notes about the options we use:

- `bytesPerChunk`: We opt to divide larger tables into approximately 100-megabyte chunks of data (it's the size before compression). Chunks allow the import utility to load even a single table with multiple threads, i.e. much faster.
- `threads`: We use 8 threads to make the dump and recommend testing which number works best for you. Usually it's a compromise between a high export speed and the smallest possible negative impact on your production server.
- `strip_definers`: This compatibility option strips the `DEFINER` clause from views and other objects in the dump. This allows re-creating them even without having the same users and `GRANT`s on the target MySQL instance.

As for the `consistent` option, let us just state here that since we run the dump under the `root` account, we never had any issues with locking related to the dumps, even under heavier production load conditions.

A typical full dump of our production instance ends with stats like these:

```
Duration: 00:13:43s
Schemas dumped: 4
Tables dumped: 281
Uncompressed data size: 316.47 GB
Compressed data size: 23.02 GB
Compression ratio: 13.7
Rows written: 266414294
Bytes written: 23.02 GB
Average uncompressed throughput: 384.43 MB/s
Average compressed throughput: 27.97 MB/s
```
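By the way, if you want to validate new options (a different chunk size, other compatibility flags, etc.) before running a real export, the dump utilities also accept a `dryRun` flag that only reports what would be dumped without writing any data. A minimal sketch; the target path is just an example:

```bash
# Hypothetical dry run: validates the options and reports what would be dumped,
# without actually exporting anything. '/tmp/dump-dry-run' is an example path.
mysqlsh -h localhost --js <<EOF
util.dumpInstance('/tmp/dump-dry-run',
                  { dryRun: true,
                    bytesPerChunk: '100M',
                    threads: 8,
                    compatibility: ['strip_definers'] })
EOF
```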
When it comes to importing the data back, the `util.loadDump()` utility that MySQL shell provides makes it actually quite easy. The tool allows loading the whole dump, just a given database, or even a particular table.

To understand how partial imports are possible, let's look at the kinds of files the dump directory contains (a sample listing follows below):

- `@.done.json`, `@.json` and similar: these files contain information about all the schemas, their tables and their chunks that the dump contains. They also store the options that the dump was exported with, as well as SQL commands to re-create the database user accounts if needed.
- `database_name.json`, `database_name.sql`: these two files provide further information about the particular schema in the dump and a DDL SQL command with its definition.
- `database_name@table_name.json`, `database_name@table_name.sql`: again, these two files provide the metadata about the particular table as well as the SQL command to recreate the table structure. A similar pattern is used for other objects (views etc.), too.
- `database_name@table_name@@chunk_number.tsv.zst`, `database_name@table_name@@chunk_number.tsv.zst.idx`: finally, these files contain the table data in the TSV format, compressed with zstd. If the data cannot be represented as simple text, it is Base64-encoded. (We are not sure what the `.idx` files are for; they seem to contain the length of the data in the main TSV file encoded in a binary format.)
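To make the naming pattern more tangible, here is a hypothetical, abridged listing of a dump directory; the directory, schema and table names are made up:

```bash
ls /mnt/backups/nere-db-backup-latest/
# @.done.json   @.json   ...
# production_db.json   production_db.sql
# production_db@users.json   production_db@users.sql
# production_db@users@@0.tsv.zst   production_db@users@@0.tsv.zst.idx
# production_db@users@@1.tsv.zst   production_db@users@@1.tsv.zst.idx
# ...
```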
Because the dump is organized this way, we only ever grab the files relevant for the data a developer actually needs. We use rsync with a set of pre-configured `--include` and `--exclude` options under the hood to do that. Overall, we can typically download only about 2 GB of data out of the 23 GB dump.
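As an illustration, a selective download of a single table's files might look roughly like this; the host, paths and table name are made up, and the real include/exclude lists are of course generated per request:

```bash
# Hypothetical selective rsync: fetch the dump metadata plus one table's files,
# skip everything else. The flat dump directory keeps the filter rules simple.
rsync -av \
  --include='@.*' \
  --include='production_db.*' \
  --include='production_db@users*' \
  --exclude='*' \
  backup-server:/mnt/backups/nere-db-backup-latest/ \
  ./local-dump/
```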
There is a catch, though: to load a single table T from database D, the tool needs to find all chunked TSV data files for table D.T + the metadata files for table D.T + the metadata for the whole dump. If we wanted to import the whole schema D, all data for the D.* tables + their metadata + the metadata of schema D + the whole dump metadata must be present. If the shell does not find the required files, it considers the dump incomplete and unfortunately just says "No data loaded" and exits without further hints.
For a single-table import this is no problem: we rsync the relevant data and metadata files from the server and call the `loadDump` shell utility with the `includeTables` option to load them.
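Such a single-table load might look roughly like this; the table name is an example and `local_backup_dir` is assumed to point at the downloaded dump:

```bash
# Hypothetical single-table import using the includeTables option.
mysqlsh -h localhost -u db_user --js <<EOF
util.loadDump('${local_backup_dir}',
              { threads: 4,
                includeTables: [ 'production_db.users' ],
                progressFile: '' })
EOF
```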
Importing a "whole" database while leaving out some of its tables is trickier, because the shell would consider such a dump incomplete. Our workaround is to create fake, empty data files for the excluded tables, so that we can still use the `loadDump` utility to load the "whole" database. Creating a `zstd`-compressed empty TSV file is easy:

```bash
echo -n '' | zstd > database_name@table_name@@chunk_number.tsv.zst
```

We also create the corresponding `.idx` file (although we are not sure it's actually needed) using a binary zero value, i.e. like this:

```bash
echo -n -e '\x0\x0\x0\x0\x0\x0\x0\x0' \
  > database_name@table_name@@chunk_number.tsv.zst.idx
```
Putting it all together, our import script works like this:

1. It downloads the needed parts of the dump from the backup server via rsync. Now we have an incomplete local copy of the dump, with some of the data / metadata files missing. Such a dump cannot be loaded by the shell yet.
2. It therefore creates the fake, empty data files for the tables excluded from the import, as described above.
3. It DROPs all objects (tables, views, etc.) that will be re-imported to the database. MySQL shell requires all loaded objects to be non-existent in the target database instance.
4. Finally, it calls the `loadDump` utility to import "the whole" database. It uses several optimizations while doing this that we'll mention below.

The `loadDump` call looks like this:

```bash
mysqlsh -h localhost -u db_user --js <<EOF
util.loadDump('${local_backup_dir}',
              { threads: 4,
                includeSchemas: [ 'production_db' ],
                skipBinlog: true,
                deferTableIndexes: 'all',
                analyzeTables: 'on',
                progressFile: '' })
EOF
```
A few notes about the options we use here:

- `threads`: Again, the number of parallel threads is something to experiment with and will differ for each machine.
- `includeSchemas`: We tell MySQL shell to import the given schema from our local "full backup".
- `skipBinlog`: This skips writing the import SQL queries to the binary log, should anyone have it enabled. This prevents unnecessarily slowing down the import and taking up disk space.
- `deferTableIndexes`: We observed the highest import speeds when we made the shell create indices after loading the data using this option.
- `analyzeTables`: This option makes the shell call `ANALYZE TABLE` for all tables after the import is done. This should speed up using the re-imported tables a bit and does not slow down the import process much.
- `progressFile`: We don't support temporarily interrupting the import process, so we don't need to track its progress.

To sum up, the main factors that make our partial imports fast are:

- Partial imports: we never download and import data that we actually don't need on the target database instance, as we explained above.
- Deferring indexes creation: we found that creating indexes after loading the data speeds up the whole import about 2×. But this is dependent on the tables / indexes structure, so we always recommend testing both variants.
- Disabling binary logs: all our developers have them disabled anyway, but if not, they are switched off during the import procedure.
- Disabling the redo log: this is something we haven't talked about yet. The redo log is a disk-based log that is used during MySQL server crash recovery. Disabling it during the import speeds up the process tremendously, about 2-3×. But beware! Disabling this log makes your whole database server vulnerable to an irrecoverable loss of data in case the server crashes. You should never use this option if you have important data in your db server. Plus, you should always have a backup ready when using this.
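For completeness, this is roughly how the redo log can be toggled around an import on MySQL 8.0.21+ (a sketch only, and again: never on a server whose data you cannot afford to lose):

```bash
# DANGER: only for disposable development instances (MySQL 8.0.21+),
# and only with a sufficiently privileged account.
mysqlsh -h localhost -u db_user --sql -e 'ALTER INSTANCE DISABLE INNODB REDO_LOG;'

# ... run the import here ...

mysqlsh -h localhost -u db_user --sql -e 'ALTER INSTANCE ENABLE INNODB REDO_LOG;'
```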