I read the article "Using grep and sed to find and replace a string", but how can I extend it to chain multiple greps? For example, I have the following directory/file structure:

dir1/metadata.txt
dir2/metadata.txt

dir1/metadata.txt has

filename1 '= 1.0.0'
filename2 '= 1.0.0'

dir2/metadata.txt has

filename1     '= 1.0.0'
long_filename '= 1.0.0'

In other words, both dir1/metadata.txt and dir2/metadata.txt contain a filename1 line with the version '= 1.0.0', but the amount of whitespace between the filename and the version differs between the two files.

Now I want to change filename1's associated version to 2.0.0 in ALL metadata.txt files, so that the resulting files look like this:

dir1/metadata.txt has

filename1 '= 2.0.0'
filename2 '= 1.0.0'

dir2/metadata.txt has

filename1     '= 2.0.0'
long_filename '= 1.0.0'

I'm trying

find . -name metadata.txt | xargs grep filename1 | sed -i "s/1\.0\.0/2.0.0/g" <some option here>

but I don't know what the "some option here" part should be. Any clues?

Chris F
  • do you need to change all `*filename*` ( e.g. `filename2` and `long_filename` ) too or only `filename1` ? – tivn May 02 '15 at 03:56
  • tivn: Only filename1. filename2 and long_filename remain unchanged – Chris F May 02 '15 at 03:59
  • shelter: your command will only change ONE file – Chris F May 02 '15 at 04:00
  • sed simply CANNOT operate on strings but see http://stackoverflow.com/questions/29613304/is-it-possible-to-escape-regex-metacharacters-reliably-with-sed/29626460#29626460 for a workaround. – Ed Morton May 02 '15 at 04:52
  • @EdMorton: Good tip in general, but the issue here is not how to _generically_ replace strings with `sed`, but how to combine `find` with `sed` (with _specific_ search and replacement strings, for which the OP has already provided the required escaping). – mklement0 May 02 '15 at 05:13
  • I understand but I thought it was worth mentioning given the subject line. – Ed Morton May 03 '15 at 11:51

2 Answers

Try the following:

Linux:

find . -name metadata.txt \
  -exec sed -i "s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/" {} +

OSX / BSD:

find . -name metadata.txt \
  -exec sed -i '' "s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/" {} +

Note: Platform-specific commands are needed only because GNU sed and BSD sed interpret the nonstandard -i option differently. The option takes a suffix to use for an optional backup of the original file: GNU sed treats that option-argument as optional, whereas BSD sed treats it as mandatory, so an explicit empty string ('') is required to indicate that no backup file should be created.

  • -exec ... + is a find feature that invokes the specified command with as many matching paths as fit on a single command line. This can result in multiple invocations, but typically results in only one, which makes the invocation efficient.

  • "s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/" is a POSIX-compliant sed script that matches literal filename1 at the beginning of a line (^), followed by a variable amount of whitespace ([[:space:]]\{1,\}), followed by literal '= 1.0.0, and replaces the 1.0.0 with 2.0.0. The capture group \(...\) and the backreference \1 preserve everything before the version number, including the original spacing.

  • Note that if there are metadata.txt files that do not have lines beginning with filename1, they are still rewritten, because sed's -i option blindly "updates" the input files given (read: creates a new file that then replaces the original). If that is undesired, consider John1024's answer.
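
To try the command against the sample data from the question, here is a minimal sketch. The scratch directory created with mktemp -d and the `sed --version` probe used to tell GNU sed from BSD sed are assumptions added for illustration; they are not part of the answer above, and the probe relies on BSD sed rejecting --version.

```shell
#!/bin/sh
# Build the two sample files from the question in a scratch directory.
demo=$(mktemp -d)
mkdir -p "$demo/dir1" "$demo/dir2"
printf '%s\n' "filename1 '= 1.0.0'" "filename2 '= 1.0.0'" > "$demo/dir1/metadata.txt"
printf '%s\n' "filename1     '= 1.0.0'" "long_filename '= 1.0.0'" > "$demo/dir2/metadata.txt"

# The sed script from the answer, kept in a variable for readability.
script="s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/"

# Assumption: GNU sed accepts --version, BSD sed does not.
if sed --version >/dev/null 2>&1; then
  find "$demo" -name metadata.txt -exec sed -i "$script" {} +
else
  find "$demo" -name metadata.txt -exec sed -i '' "$script" {} +
fi

# Show the result; only the filename1 lines should now read 2.0.0.
cat "$demo/dir1/metadata.txt" "$demo/dir2/metadata.txt"
```

Running it in a throwaway directory first makes it easy to confirm that filename2 and long_filename are left alone before pointing find at real data.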

POSIX-compliance notes:

  • The -exec ... + variant of find's -exec primary has been part of POSIX since 2001 (POSIX.1-2001 / IEEE Std 1003.1-2001 / SUS v3 - see http://pubs.opengroup.org/onlinepubs/009695399/; thanks, @JonathanLeffler)
  • By contrast, sed's -i option for in-place updating is not POSIX-compliant - so you may have to work around that.
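
One possible workaround for the missing -i, sketched under the assumption that writing to a sibling temporary file and moving it over the original is acceptable: find hands the paths to a small inline sh script that does the rewrite itself. The ".tmp" suffix and the scratch directory are hypothetical choices for this example. Note that mv replaces the original file, so permissions and hard links may not carry over.

```shell
#!/bin/sh
# Sample data in a scratch directory (illustrative only).
demo=$(mktemp -d)
mkdir -p "$demo/dir1"
printf '%s\n' "filename1 '= 1.0.0'" "filename2 '= 1.0.0'" > "$demo/dir1/metadata.txt"

script="s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/"

# No sed -i: write the edited copy to "<file>.tmp", then move it into place.
# The sed script is passed as the first argument; the paths from find follow.
find "$demo" -name metadata.txt -exec sh -c '
  script=$1
  shift
  for f in "$@"; do
    sed "$script" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
' sh "$script" {} +

cat "$demo/dir1/metadata.txt"
```

Because the same command line works with both GNU and BSD tools, this variant also avoids the need for separate Linux and OSX commands.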
mklement0
  • Use `+` in place of `\;`? – Jonathan Leffler May 02 '15 at 04:14
  • According to [Apple's `man find`](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/find.1.html), its current version of `find` supports `{} +`. Of course, there will be other or older BSD systems that don't support it. – John1024 May 02 '15 at 04:18
  • Thanks, @John1024: `find`'s `-exec .... +` _is_ POSIX-compliant (see http://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html), but `sed`'s `-i` is not. – mklement0 May 02 '15 at 04:20
  • The `+` is from POSIX 2004 ([`find`](http://pubs.opengroup.org/onlinepubs/009695399/utilities/find.html)), so those systems that don't support it are at least a decade out of date. That isn't to say they don't exist, nor that it isn't worth mentioning the fallback. – Jonathan Leffler May 02 '15 at 04:21
  • @mklement0 Yes, very good. Plus 1 for both GNU and BSD versions. – John1024 May 02 '15 at 04:22
  • For RH Linux, how do I change it so I don't care if there is a comma (",") between filename1 and '1.0.0', and it can be anywhere; IOW "filename1, '1.0.0'" or "filename1 ,'1.0.0'" – Chris F May 02 '15 at 04:52
  • @JonathanLeffler: Thanks for the POSIX hint; from what I understand, POSIX 2004 is the same as POSIX.1-2001 (IEEE Std 1003.1-2001) - SUS v3, only with _technical corrections_; thus, we're talking 14 years ago - is that your understanding too? – mklement0 May 02 '15 at 04:53
  • Yup: the 'title page' for the [POSIX 2004](http://pubs.opengroup.org/onlinepubs/009695399/) documentation says: ***Abstract:*** _The 2004 edition incorporates Technical Corrigendum Number 1 and Technical Corrigendum 2 addressing problems discovered since the approval of the 2001 edition. These are mainly due to resolving integration issues raised by the merger of the Base documents._ So there is room to argue that manufacturers who have not gotten around to adding `+` to `find` have had all of thirteen years (2002-2014) and bits of 2001 and 2015 in which to fix the issue. – Jonathan Leffler May 02 '15 at 04:59
  • @ChrisF: The simplest approach is to replace `[[:space:]]\{1,\}` with `[[:space:],]\{1,\}`, but this is slightly more permissive than what you asked for; if you know that there is _at most_ one comma, this should work, however. – mklement0 May 02 '15 at 05:01
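
A quick way to check the comma-tolerant pattern from the last comment without touching any files is to feed sed a few sample lines on standard input. The sample lines below are made up for illustration and keep the '= ' prefix from the question's file format, which is an assumption about what the real data looks like.

```shell
#!/bin/sh
# The answer's script with a comma added to the bracket expression.
script="s/^\(filename1[[:space:],]\{1,\}'= \)1\.0\.0/\12.0.0/"

# Three filename1 variants (comma before, after, or absent) plus one
# line for a different file that must stay unchanged.
out=$(printf '%s\n' \
  "filename1, '= 1.0.0'" \
  "filename1 ,'= 1.0.0'" \
  "filename1 '= 1.0.0'" \
  "filename2, '= 1.0.0'" | sed "$script")

printf '%s\n' "$out"
```

As the comment points out, the bracket expression is more permissive than strictly requested: it would also accept several commas in a row.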
find . -name metadata.txt -exec grep -l --null filename1 {} + | xargs -0 sed -i "/^filename1 /{s/'= 1\.0\.0'/'= 2.0.0'/;}"

sed -i will update the timestamp of every file it processes regardless of whether it changes the contents of the file. This is because, in operation, sed -i creates a new file for each file processed and then overwrites the old file with the new file. To limit this, the above code uses grep to select only the files that might need modification and sends only those file names, via a pipeline, to sed -i for the update.

If the timestamp/overwriting issue is not important, consider mklement0's answer which eliminates the need for a pipeline, simplifying the command.

How it works

  • find . -name metadata.txt -exec grep -l --null filename1 {} +

    This produces the list of files named metadata.txt that also contain filename1.

    The --null tells grep to separate file names with the NUL character.

  • xargs -0 sed -i "/^filename1 /{s/'= 1\.0\.0'/'= 2.0.0'/;}"

    This applies sed -i to change in-place the files whose names were returned by the above find command.

    In more detail:

    • /^filename1 /

      This selects lines that start with filename1 followed by a space. This ensures that we match neither sfilename1 nor filename12.

    • s/'= 1\.0\.0'/'= 2.0.0'/

      This changes the version number for the selected lines. (This assumes only one space after the equal sign. If this assumption is not correct, we can easily change it.)

    The -0 option to xargs tells it to expect its input to be a NUL-separated list of file names. This makes the pipeline safe even if the file names include spaces, newlines, or other difficult characters.
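
The effect of the grep filter can be checked with a small experiment, sketched here assuming GNU sed (on BSD/OSX, `sed -i` would need to become `sed -i ''`). The scratch directory, the directory name containing a space, and the inode comparison via `ls -i` are all choices made for this illustration.

```shell
#!/bin/sh
# One file that mentions filename1 (in a directory whose name has a space)
# and one file that does not.
demo=$(mktemp -d)
mkdir -p "$demo/dir 1" "$demo/dir2"
printf '%s\n' "filename1 '= 1.0.0'" "filename2 '= 1.0.0'" > "$demo/dir 1/metadata.txt"
printf '%s\n' "long_filename '= 1.0.0'" > "$demo/dir2/metadata.txt"

# Record the inode of the file that should be left alone.
before=$(ls -i "$demo/dir2/metadata.txt" | awk '{print $1}')

# The pipeline from the answer, pointed at the scratch directory.
find "$demo" -name metadata.txt -exec grep -l --null filename1 {} + |
  xargs -0 sed -i "/^filename1 /{s/'= 1\.0\.0'/'= 2.0.0'/;}"

after=$(ls -i "$demo/dir2/metadata.txt" | awk '{print $1}')

cat "$demo/dir 1/metadata.txt"
# An unchanged inode suggests that sed never replaced the second file.
[ "$before" = "$after" ] && echo "dir2/metadata.txt was not rewritten"
```

The file under "dir 1" is edited despite the space in its path, while the file without filename1 is never handed to sed at all.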

John1024
  • Be aware that while the `-exec` handles spaces in file names, the `xargs` won't. You can fix that with the GNU toolchain by using `grep -lZ` and `xargs -0` so that the file names are terminated with a null byte (instead of a newline). Alternatively, you can execute the `sed` in the `-exec` option. The downside of using `-exec` is that it might edit a file which does not contain `filename1`. That's relatively unlikely to matter, even if there are thousands of files to process, unless there are reasons not to risk modifying the 'last changed time' of the files unless something actually changes. – Jonathan Leffler May 02 '15 at 05:13
  • The 'might modify files that don't need modifying' observation applies to the other answer. I like the simpler 'match the marker; substitute the relevant text on the marked line' operation in the `sed` script. I don't understand why people insist on using a single line for shell scripts — 'one-liner' is a pejorative term in APL. – Jonathan Leffler May 02 '15 at 05:17
  • @JonathanLeffler Thanks. I updated the answer to include `-Z`/`-0`. I kept the grep-to-sed pipeline because the `last changed time` issue does surprise/confuse users who aren't expecting it. Separately, I found well-written APL scripts to be quite readable. I wouldn't mind if shell tools were rewritten by someone with Ken Iverson's eye for logical consistency. – John1024 May 02 '15 at 06:01
  • John, @JonathanLeffler: Good point re timestamps (++); I've updated my answer to mention the issue and linked to this one as an alternative. John, perhaps you can update your answer to explain when your answer is preferable over mine. – mklement0 May 02 '15 at 15:26
  • John: If you change `-Z` to `--null`, the command will work on BSD systems (including OSX) too. @JonathanLeffler: The POSIX-compliant solution to handling embedded spaces (assuming there are no filenames with _newlines_) is to use `xargs -I {} sed ... {}`, but note that that inevitably invokes `sed` once for every file. – mklement0 May 02 '15 at 15:31
  • Note that BSD `grep` has a `-Z` option, but it is wholly different from the GNU `grep -Z`: it makes it work like `zgrep` (so it searches compressed files too). – Jonathan Leffler May 02 '15 at 15:38
  • @mklement0 I updated the answer to `--null`, mentioned the issue with timestamps, and linked back to your answer. – John1024 May 02 '15 at 18:16