I am working on a webtool to mirror a Wordpress installation into a development system.
The aim is to have a Live system for production and a development system for testing. The webtool then offers a one-click-sync between those systems. Each of the systems is standalone, with its own webroot, database and url.
I am having a trouble with the database dump in which I have to search all the references to the source and replace them with the URL of the destination (e.g.: "www.example.com" -> "www-dev.example.com").
What I need to do is:
Find all occurences of the URL and replace it with the new one.
IF the match also matches the format of a serialized string it should set the Field-Seperator, and reload the match, so that the actual length can be set in the array.
In a first attempt I tried to solve this with a 'sed' command looking as follows: sed -i.orig 's/360\.example\.com/360-dev\.my\.example\.dev/g'
.
This didn't work because there are serialized arrays contained in the dump, containing the url. The sed
command is no good for updating the string-length-indicator of the serialized arrays.
My latest attempt is to use an awk
as suggested here, because it's capable of arithmetic operations.
My awk
script looks like this:
/360[.]example[.]com/ {
sub("360.example.com", "360-dev.my.example.dev");
if ($0 ~ /s:[[:digit:]]+:["](http[s]?:\/\/)?360[.]example[.]com["]/){
FS="\"";
$0=$0;
n=length($2)-1;
sub(/:[[:digit:]]+:/, ":" n ":");
}
} 1
There seem to be some errors in my script, which I can't find. It does not replace all of the occurrences of the url and completely skips the length-indicator-update.
How can I fix my script to achieve what I want to do?
EDIT: (Added Input/Output samples)
Databasedump consists of the whole wordpress-database with CREATE TABLE IF NOT EXISTS
and INSERT
statements for each table and record.
Normal (unserialized) occurence:
(36, 'home', 'http://360.example.com/blogname', 'yes'),
should result in:
(36, 'home', 'http://360-dev.my.example.dev/blogname', 'yes'),
Serialized occurence:
(404, 'wp-maintenance-mode', 'a:21:{s:6:"active";i:1;s:4:"time";i:0;s:4:"link";i:1;s:7:"support";i:0;s:10:"admin_link";i:1;s:7:"rewrite";s:0:"";s:6:"notice";i:1;s:4:"unit";i:1;s:5:"theme";i:0;s:8:"styleurl";s:69:"http://360.example.com/wp-content/themes/blogname/css/maintenance.css";s:5:"index";i:0;s:5:"title";s:0:"";s:6:"header";s:0:"";s:7:"heading";s:0:"";s:4:"text";s:12:"Example Text";s:7:"exclude";a:1:{i:0;s:0:"";}s:6:"bypass";i:0;s:4:"role";a:1:{i:0;s:13:"administrator";}s:13:"role_frontend";a:1:{i:0;s:13:"administrator";}s:5:"radio";i:0;s:4:"date";s:0:"";}', 'yes'),
Should result in:
(404, 'wp-maintenance-mode', 'a:21:{s:6:"active";i:1;s:4:"time";i:0;s:4:"link";i:1;s:7:"support";i:0;s:10:"admin_link";i:1;s:7:"rewrite";s:0:"";s:6:"notice";i:1;s:4:"unit";i:1;s:5:"theme";i:0;s:8:"styleurl";s:76:"http://360-dev.my.example.dev/wp-content/themes/blogname/css/maintenance.css";s:5:"index";i:0;s:5:"title";s:0:"";s:6:"header";s:0:"";s:7:"heading";s:0:"";s:4:"text";s:12:"Example Text";s:7:"exclude";a:1:{i:0;s:0:"";}s:6:"bypass";i:0;s:4:"role";a:1:{i:0;s:13:"administrator";}s:13:"role_frontend";a:1:{i:0;s:13:"administrator";}s:5:"radio";i:0;s:4:"date";s:0:"";}', 'yes'),
EDIT 2:
Now using wp-cli
to do the task of search & replace.
I've got a multisite setup with blogs numbered (2,3,9).
Executing wp search-replace --url=360.example.com '360.example.com' '360-dev.my.example.dev'
results in an error, telling me that the Single-Site tables (wp_redirection_items and wp_redirection_groups) cannot be found.
This is true, because they really do not exist, but rather for each blog (e.g: wp_2_redirection_items and so on). This error results in over 9000 missed occurences in s&r. It's possible to replace everything with wp search-replace --url=360.example.com '360.example.com' '360-dev.my.example.com' wp_*
. But it still throws the error.