Introduction
At work we have to deal with very large mysqldump
backups of our production MySQL databases (sometimes they can be over 10 GBs large). From time to time we have to restore single tables in our local development environment. Importing the entire database is out of question because this operation can take up to 40 minutes to complete. Manually editing the SQL dump is also impossible because opening such large files will likely kill any text editor.
I’m aware that we can use mysqldump to dump only the content of our table
$ mysqldump -u [user] -p[password] [database] [table] > [output].sql
or use more appropriate tools such as mydumper, but we were wondering if this was possible using our existing backup files generated by AWS.
Turns out that this task is really straightforward using the Unix sed tool.
SED
Sed stands for stream editor and it is a non-interactive tool that can perform basic text transformations on an input stream. It is often used in combination with piped inputs or outputs.
Sed offers several operations, but the most commonly used are
- delete
- substitute
In our case we only want to send to stdout
all the lines starting from CREATE TABLE <table>
up to the next CREATE TABLE
statement. Those lines will also contain all the INSERT INTO <table>
instructions.
The Command
First, let’s look at the final command and then we will try to break it out:
$ sed -n -e '/CREATE TABLE.*`table`/,/CREATE TABLE/p' dump.sql > table.sql
- the option
-n
disables automatic printing (equals to--quiet
or--silent
). The command will not print anything unless an explicit request to print is foundp
. - the option
-e
(equivalent to--expression=script
) indicates to Sed that the next string is a command. We can omit that if the string contains only one command.
To determine which lines we want to target we can use Sed addresses. Our address is in the form /PATTERN/,/PATTERN/
, which means that we are targeting a range of lines (we have a comma in the middle). Of course /PATTERN/
is any regular expression.
Therefore the given RegExp will instruct sed to print all the lines between CREATE TABLE table
and the next CREATE TABLE
occurrence from the dump.sql
file into a new table.sql
file.
Notice:
The RegExp is matched in a case-sensitive manner. If you want to disable that you can use the I
modifier /regexp/I
The command can be further break down to the following example, which shows more clearly the behaviour of sed
$ seq 9 | sed -n '/1/,/3/p'
This will print only numbers from 1 to 3
1
2
3
Take note that the command will also include the second pattern in the final output. Therefore some minor adjustments will still be necessary, more importantly to make sure that the final output does not contain any DROP
statements.