3

Below is an example snippet from a sql dump file. This specific row contains a meta_value of a Wordpress PHP serialized array. During database restores in dev., test., and qc. environments I'm using sed to replace URLs with the respective environment sub-domain.

INSERT INTO `wp_postmeta`
    (`meta_id`,
     `post_id`,
     `meta_key`,
     `meta_value`)
VALUES
    (527,
     1951,
     'ut_parallax_image',
     'a:4:{
          s:17:\"background-image\";
          s:33:\"http://example.com/background.jpg\";
          s:23:\"mobile-background-image\";
          s:37:\"www.example.com/mobile-background.jpg\";
      }')
;

However, I need to extend this to correct the string length in the serialized arrays after replace.

sed -r -e "s/:\/\/(www\.)?${domain}/:\/\/\1${1}\.${domain}/g" "/vagrant/repositories/apache/$domain/_sql/$(basename "$file")" > "/vagrant/repositories/apache/$domain/_sql/$1.$(basename "$file")"

The result should look like this for dev.:

INSERT INTO `wp_postmeta`
    (`meta_id`,
     `post_id`,
     `meta_key`,
     `meta_value`)
VALUES
    (527,
     1951,
     'ut_parallax_image',
     'a:4:{
          s:17:\"background-image\";
          s:37:\"http://dev.example.com/background.jpg\";
          s:23:\"mobile-background-image\";
          s:41:\"www.dev.example.com/mobile-background.jpg\";
      }')
;

I'd prefer to not introduce any dependancies other than sed.

Seth Reeser
  • 195
  • 2
  • 2
  • 13

5 Answers5

3

Thanks @John1024. @Fabio and @Seth, I not sure for perfomance, but these code work and without wp-cli:

localdomain=mylittlewordpress.local
maindomain=strongwordpress.site.ru
cat dump.sql | sed 's/;s:/;\ns:/g' | awk -F'"' '/s:.+'$maindomain'/ {sub("'$maindomain'", "'$localdomain'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' | sed ':a;N;$!ba;s/;\ns:/;s:/g' | sed "s/$maindomain/$localdomain/g" | mysql -u$USER -p$PASS $DBNAME

PHP serialized string exploded by ';s:' to multiline string and awk processed all lines by @John1024 solution.

cat dump.sql | sed 's/;s:/;\ns:/g'

Redirect output to awk

awk -F'"' '/^s:.+'$maindomain'/ {sub("'$maindomain'", "'$localdomain'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1'

After all lines processed, multiline implode to one line (as then exists in original dump.sql). Thanks @Zsolt https://stackoverflow.com/a/1252191

sed ':a;N;$!ba;s/;\ns:/;s:/g'

Addition sed replacement need for any other strings in wordpress database.

sed "s/$maindomain/$localdomain/g"

And load into main server DB

... | mysql -u$USER -p$PASS $DBNAME
  • May work fine for simple strings, but this will not handle multidimensional arrays. – Tim Mishutin May 31 '21 at 10:28
  • This answer is great and will undoubtedly run with better performance than a PHP implementation. It's memory safe too – Richard Tyler Miles Oct 26 '22 at 20:42
  • Thanks for linking @Zsolt's post as an Edit made, noting [@Benjie comment](https://stackoverflow.com/questions/1251999/how-can-i-replace-a-newline-n-using-sed?page=1&tab=votes#comment9175314_1252191) which gives a second option for multiline implode `sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g'` which works on MAC OS and RHEL – Richard Tyler Miles Oct 27 '22 at 01:31
  • You may be able to [speed up the awk command](https://serverfault.com/questions/1114188/php-serialize-awk-command-speed-up/1114191#1114191) by updating related libraries. – Richard Tyler Miles Oct 27 '22 at 23:29
1

Your algorithm involves arithmetic. That makes sed a poor choice. Consider awk instead.

Consider this input file:

$ cat inputfile
  something...
  s:33:\"http://example.com/background.jpg\";
  s:37:\"www.example.com/mobile-background.jpg\";
  s:33:\"http://www.example.com/background.jpg\";
  more lines...

I believe that this does what you want:

$ awk -F'"' '/:\/\/(www[.])?example.com/ {sub("example.com", "dev.example.com"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' inputfile 
  something...
  s:37:\"http://dev.example.com/background.jpg\";
  s:37:\"www.example.com/mobile-background.jpg\";
  s:41:\"http://www.dev.example.com/background.jpg\";
  more lines...
John1024
  • 109,961
  • 14
  • 137
  • 171
  • Thank you for your answer - unfortunately this slightly missed the point, I'm looking to count the match between \" "\ and update the s:## number. – Seth Reeser Apr 27 '15 at 20:57
  • @SethReeser OK. I see now. You really don't want to do arithmetic with sed. Please consider the `awk` solution in the updated answer. awk is a standard part of the Unix tool set. – John1024 Apr 27 '15 at 21:09
  • I'm running into instances where this isn't working when the array is on the same line – Seth Reeser Apr 28 '15 at 02:44
  • @SethReeser Would you provide an example of when "the array is on the same line"? (Don't try to put it in a comment: comments format poorly. Add as an update to the bottom of the existing question.) – John1024 Apr 28 '15 at 03:19
  • thanks for all of your help, I ended up going with WP-CLI http://wp-cli.org/ as it has a built in command search-replace that handles this exact scenario. – Seth Reeser May 01 '15 at 14:25
1

WP-CLI handles serialized PHP arrays during a search-replace http://wp-cli.org/commands/search-replace/. I wanted to try a native shell solution, but having WP-CLI was worth the extra overhead in the end.

Seth Reeser
  • 195
  • 2
  • 2
  • 13
  • Well, I'm trying to build a test environment of a site and I want to take a new database dump from production every day. The database size is 3gb. When I run `wp search-replace`, I have to wait some hours to get 1446982 replacements. A shell solution would be killer for big databases. I tryed, but its not so simple to replace strings on serialized php arrays. – Fabio Montefuscolo Oct 09 '15 at 23:03
0

Here is a sample text file you asked for (it's a database export).

Original (https://www.example.com) :

LOCK TABLES `wp_options` WRITE;
INSERT INTO `wp_options` VALUES (1,'siteurl','https://www.example.com','yes'),(18508,'optionsframework','a:48:{s:4:\"logo\";s:75:\"https://www.example.com/wp-content/uploads/2014/04/logo_imbrique_small3.png\";s:7:\"favicon\";s:62:\"https://www.example.com/wp-content/uploads/2017/04/favicon.ico\";}','yes')
/*!40000 ALTER TABLE `wp_options` ENABLE KEYS */;
UNLOCK TABLES;

Result needed (http://example.localhost) :

LOCK TABLES `wp_options` WRITE;
INSERT INTO `wp_options` VALUES (1,'siteurl','http://example.localhost','yes'),(18508,'optionsframework','a:48:{s:4:\"logo\";s:76:\"http://example.localhost/wp-content/uploads/2014/04/logo_imbrique_small3.png\";s:7:\"favicon\";s:64:\"https://example.localhost/wp-content/uploads/2017/04/favicon.ico\";}','yes');
/*!40000 ALTER TABLE `wp_options` ENABLE KEYS */;
UNLOCK TABLES;

As you can see :

  • there is multiple occurence on the same line
  • escape characters aren't counted in length number (eg: "/")
  • some occurence aren't preceded by "s:" length number (no need to replace, it can be done after awk with a simple sed)

Thanks in advance !

Damien
  • 1
  • 3
0

@Alexander Demidov's answer is great, here's our implementation for reference

public static function replaceInFile(string $replace, string $replacement, string $absoluteFilePath): void
{

    ColorCode::colorCode("Attempting to replace ::\n($replace)\nwith replacement ::\n($replacement)\n in file ::\n(file://$absoluteFilePath)", iColorCode::BACKGROUND_MAGENTA);

    $replaceDelimited = preg_quote($replace, '/');

    $replacementDelimited = preg_quote($replacement, '/');

    $replaceExecutable = CarbonPHP::CARBON_ROOT . 'extras/replaceInFileSerializeSafe.sh';

        // @link https://stackoverflow.com/questions/29902647/sed-match-replace-url-and-update-serialized-array-count
    $replaceBashCmd = "chmod +x $replaceExecutable && $replaceExecutable '$absoluteFilePath' '$replaceDelimited' '$replace' '$replacementDelimited' '$replacement'";

    Background::executeAndCheckStatus($replaceBashCmd);

}

public static function executeAndCheckStatus(string $command, bool $exitOnFailure = true): int
{

    $output = [];

    $return_var = null;

    ColorCode::colorCode('Running CMD >> ' . $command,
        iColorCode::BACKGROUND_BLUE);

    exec($command, $output, $return_var);

    if ($return_var !== 0 && $return_var !== '0') {

        ColorCode::colorCode("The command >>  $command \n\t returned with a status code (" . $return_var . '). Expecting 0 for success.', iColorCode::RED);

        $output = implode(PHP_EOL, $output);

        ColorCode::colorCode("Command output::\t $output ", iColorCode::RED);

        if ($exitOnFailure) {

            exit($return_var);

        }

    }

    return (int) $return_var;

}
    
#!/usr/bin/env bash

set -e

SQL_FILE="$1"

replaceDelimited="$2"

replace="$3"

replacementDelimited="$4"

replacement="$5"

if ! grep --quiet "$replace" "$SQL_FILE" ;
then

  exit 0;

fi

cp "$SQL_FILE" "$SQL_FILE.old.sql"

# @link https://stackoverflow.com/questions/29902647/sed-match-replace-url-and-update-serialized-array-count
# @link https://serverfault.com/questions/1114188/php-serialize-awk-command-speed-up/1114191#1114191
sed 's/;s:/;\ns:/g' "$SQL_FILE" | \
  awk -F'"' '/s:.+'$replaceDelimited'/ {sub("'$replace'", "'$replacement'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' 2>/dev/null  | \
  sed -e ':a' -e 'N' -e '$!ba' -e 's/;\ns:/;s:/g' | \
  sed "s/$replaceDelimited/$replacementDelimited/g" > "$SQL_FILE.replaced.sql"

cp "$SQL_FILE.replaced.sql" "$SQL_FILE"