
I need to replace the LGPL license header in all of my Java source files with the Apache License 2.0 header, i.e. this

/*
 * Copyright (c) 2012 Tyler Treat
 * 
 * This file is part of Project Foo.
 *
 * Project Foo is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * Project Foo is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with Project Foo.  If not, see <http://www.gnu.org/licenses/>.
 */

needs to become

/*
 * Copyright (c) 2012 Tyler Treat
 * 
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 * 
 *  http://www.apache.org/licenses/LICENSE-2.0
 * 
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

I figured the easiest way would be to use sed to do a find and replace on all occurrences of this copyright header. I'm a bit of a Unix novice, so I was having problems getting the command working the way I needed it to -- specifically, dealing with the multiline strings. Basically, something like below, except the respective headers in place of foo and bar:

find . -name "*.java" -print | xargs sed -i 's/foo/bar/g'

I understand that sed works on one line at a time, so maybe there is a better solution altogether?

Tyler Treat
  • I know it may sound stupid, but don't forget to check if the text editor you use has a search-and-replace-in-files function that supports multi-line texts. – Jan Schejbal Jan 01 '13 at 02:15

3 Answers

find . -name "*.java" -print0 | xargs -0 \
sed -i -e '/Project Foo is free software/,/along with Project Foo/c\
 * Licensed under the Apache License, Version 2.0 (the "License");\
 * you may not use this file except in compliance with the License.\
 * You may obtain a copy of the License at\
 *\
 *  http://www.apache.org/licenses/LICENSE-2.0\
 *\
 * Unless required by applicable law or agreed to in writing, software\
 * distributed under the License is distributed on an "AS IS" BASIS,\
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\
 * See the License for the specific language governing permissions and\
 * limitations under the License.'

The c command replaces the addressed range of lines with the specified text. The range runs from the line containing 'Project Foo is free software' through the line containing 'along with Project Foo'. Using -i without a backup suffix suggests GNU sed (BSD sed, as shipped with Mac OS X, requires a suffix argument after -i), so I'm assuming that you have GNU find and xargs too, and I used -print0 and -0 to avoid issues with blanks and other awkward characters in file names.
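Before editing a whole tree in place, it can help to preview which lines the address range actually selects. The following is a minimal sketch, assuming a throwaway file named Foo.java (a hypothetical name) that carries an abbreviated copy of the LGPL header; `sed -n` with `p` prints only the selected lines and modifies nothing.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory and sample file, for illustration only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

cat > "$work/Foo.java" <<'EOF'
/*
 * Copyright (c) 2012 Tyler Treat
 *
 * This file is part of Project Foo.
 *
 * Project Foo is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * along with Project Foo.  If not, see <http://www.gnu.org/licenses/>.
 */
public class Foo {}
EOF

# -n suppresses the default output; p prints only the lines inside the
# range, i.e. exactly the lines that the c command would replace.
selected=$(sed -n '/Project Foo is free software/,/along with Project Foo/p' "$work/Foo.java")
printf '%s\n' "$selected"
```

If the preview shows the copyright line or any code, the range patterns need tightening before `-i` is used.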

For this, I might be tempted to put the sed script into a file (sed.script), which could then be used with:

find . -name "*.java" -exec sed -i -f sed.script {} +

This is neater, I think, but beauty is in the eye of the beholder.


> Just one question: the alignment is a little off on the asterisks, is there some sort of whitespace character I need to use to indent them? I tried adding spaces to the replacement string but that seemed to have no effect.

Grrr...that's the sort of irritation I could do without (and you too). It seems that leading blanks on the 'change' data lines are dropped by sed. It seems to be sed rather than bash; I got the same result with ksh and also using a script file instead of the -e option on the command line. You can't edit the 'change' data as it is output.
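Whether the leading blanks survive seems to depend on which sed is installed, so a quick check may save some guessing. This is only a sketch: it pushes one dummy line through a `c` command whose text starts with a blank and compares the output with the text as written. The variable names are made up for the example.

```shell
#!/bin/sh
set -eu

# Feed one dummy line through a c command whose text starts with a blank.
out=$(printf 'x\n' | sed 'c\
 * replacement')

# Compare the result with the text as written, leading blank included.
if [ "$out" = " * replacement" ]; then
    result=preserved
else
    result=stripped
fi
echo "leading blank: $result"
```

If this reports "stripped", the workarounds described here apply; if it reports "preserved", the plain `c` script should already produce aligned asterisks.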

One trick that would work — but you may not be keen on it:

$ cat sed.script
/Project Foo is free software/,/along with Project Foo/c\
 * Licensed under the Apache License, Version 2.0 (the "License");\
 * you may not use this file except in compliance with the License.\
 * You may obtain a copy of the License at\
 *\
 *  http://www.apache.org/licenses/LICENSE-2.0\
 *\
 * Unless required by applicable law or agreed to in writing, software\
 * distributed under the License is distributed on an "AS IS" BASIS,\
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\
 * See the License for the specific language governing permissions and\
 * limitations under the License.
$ s2p -f sed.script > perl.script
$ find . -name "*.java" -exec perl -f perl.script -i.bak {} +
$

The s2p program converts sed scripts into Perl scripts, and it preserves the leading spaces in the replacement data. It was a standard part of the Perl distribution up to Perl 5.20; it was removed from the core in 5.22 and is now available from CPAN as App::s2p. I'm not keen on this, but the only alternative I can think of is making two passes through each file. The replacement data might be:

$ cat sed.script
/Project Foo is free software/,/along with Project Foo/c\
@*@ Licensed under the Apache License, Version 2.0 (the "License");\
@*@ you may not use this file except in compliance with the License.\
@*@ You may obtain a copy of the License at\
@*@\
@*@  http://www.apache.org/licenses/LICENSE-2.0\
@*@\
@*@ Unless required by applicable law or agreed to in writing, software\
@*@ distributed under the License is distributed on an "AS IS" BASIS,\
@*@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\
@*@ See the License for the specific language governing permissions and\
@*@ limitations under the License.
$

After doing the main text replacement, you'd then do:

$ find . -name "*.java" -exec sed -i 's/^@\*@/ */' {} +
$

This tracks down the lines starting with @*@ and replaces that marker with ' *' (blank-star). Not as neat and tidy, but you aren't going to be doing this all that often, I trust.
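The two passes can be tried safely on a scratch copy first. The sketch below assumes a hypothetical Foo.java with a shortened header and a shortened replacement text, and it writes to new files rather than using -i so that it behaves the same with either sed.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory and file names, for illustration only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

cat > Foo.java <<'EOF'
/*
 * Copyright (c) 2012 Tyler Treat
 *
 * Project Foo is free software: you can redistribute it and/or modify
 * along with Project Foo.  If not, see <http://www.gnu.org/licenses/>.
 */
public class Foo {}
EOF

# Pass 1: replace the range with placeholder-prefixed text (shortened here).
cat > pass1.sed <<'EOF'
/Project Foo is free software/,/along with Project Foo/c\
@*@ Licensed under the Apache License, Version 2.0 (the "License");\
@*@ limitations under the License.
EOF
sed -f pass1.sed Foo.java > Foo.pass1

# Pass 2: turn each leading @*@ marker into blank-star.
sed 's/^@\*@/ */' Foo.pass1 > Foo.pass2
cat Foo.pass2
```

The marker only has to be a string that cannot occur at the start of a line in the real sources; @*@ is just one choice.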

Jonathan Leffler
  • Thanks, this works! Just one question: the alignment is a little off on the asterisks, is there some sort of whitespace character I need to use to indent them? I tried adding spaces to the replacement string but that seemed to have no effect. – Tyler Treat Jan 01 '13 at 03:04
  • @TylerTreat Leading whitespace is preserved with GNU sed. You may want to use a different sed version, try some of the whitespace escape tips from http://www.grymoire.com/Unix/Sed.html#uh-43, or use a `r`ead instead. – Todd A. Jacobs Jan 01 '13 at 04:58
  • @CodeGnome: Interesting; I confirm that the Mac OS X (10.7.5) `sed` strips the leading blanks (somewhat to my surprise), but GNU `sed` does not. I've no idea why the Mac OS X version plays silly games with the leading blanks; I would not have predicted it at all. But using `/usr/bin/sed` and `/usr/gnu/bin/sed` on the same script file (`sed.script`) gives different results. Weird — easily arguably a bug in Mac OS X `sed`. – Jonathan Leffler Jan 01 '13 at 05:05
  • Yeah, I ended up just building GNU sed and using that. Thanks! – Tyler Treat Jan 01 '13 at 05:40
  • This is working excellently except that it's adding a ^M to the end of every line in the file. Any idea how to suppress that? – ColonelPackage May 16 '14 at 02:42
  • @ColonelPackage: There is nothing in the script that would introduce carriage returns AFAICS. My best guess would be that a Windows machine was involved somewhere along the line and the CR was introduced that way. Otherwise, check the script file with a hex dump program to see whether there are CR characters in it. If not, I am bemused. – Jonathan Leffler May 16 '14 at 03:12

Partial License Replacement Using GNU Sed

You can use GNU sed to solve this with some regular expression line matches and a read expression. Here are the steps.

Use a File to Hold Replacement Text

First, create a file to hold the replacement portion of your license:

cat << EOF > /tmp/license
 * 
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 * 
 *  http://www.apache.org/licenses/LICENSE-2.0
 * 
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
EOF

Run the Actual Sed Invocation

Next, run find to collect your file list, and invoke the following sed script to make the changes:

find . -name '*.java' |
xargs sed -i'' '/Copyright.*Tyler Treat/,/\*\// {
                    /Copyright/n
                    /\*\//r /tmp/license
                    d
                }'

Compatibility Note

This solution may or may not work with other versions of sed, but was tested locally and known to work with GNU sed version 4.2.1. If it doesn't work with the version of sed shipped with your edition of OS X, you can install GNU sed via MacPorts or similar.
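To see how the three commands inside the braces cooperate, the script can be run against a scratch file without -i. This is a sketch under stated assumptions: Foo.java, license.txt and swap.sed are hypothetical names, and both the header and the replacement text are shortened. `n` prints the copyright line and moves on to the next line, `r` queues the replacement file when the closing `*/` is reached, and `d` discards every other line in the range.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory and file names, for illustration only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

cat > Foo.java <<'EOF'
/*
 * Copyright (c) 2012 Tyler Treat
 *
 * This file is part of Project Foo.
 * along with Project Foo.  If not, see <http://www.gnu.org/licenses/>.
 */
public class Foo {}
EOF

# Shortened stand-in for the replacement text, ending with the comment close.
cat > license.txt <<'EOF'
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 */
EOF

# Same structure as the command above, reading license.txt instead of /tmp/license.
cat > swap.sed <<'EOF'
/Copyright.*Tyler Treat/,/\*\// {
    /Copyright/n
    /\*\//r license.txt
    d
}
EOF

# Write to a new file instead of editing in place.
sed -f swap.sed Foo.java > Foo.new
cat Foo.new
```

Comparing Foo.java with Foo.new (for example with diff) shows what an in-place run would change.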

Todd A. Jacobs
  • The only quibble I have with this is that it seems to delete the formal copyright line, rather than leaving it in place. Otherwise, it is very close to what I proposed, and the difference is easily fixed. – Jonathan Leffler Jan 01 '13 at 02:39
  • @JonathanLeffler It must be your sed. The copyright line is preserved on my end 100% of the time as posted. Did you actually try it? – Todd A. Jacobs Jan 01 '13 at 04:41
  • I take it back; my mistake...it isn't the way I'd think of doing it, but the `n` in the action copies the Copyright line... – Jonathan Leffler Jan 01 '13 at 04:45
  • I really like that you did this by using a file as the text source. Awesome tip. – bhilburn Oct 10 '13 at 16:23

Assuming file1 contains your original text and file2 contains your replacement copyright comment:

awk 'f; /\*\//{system("cat file2");f=1}' file1

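The above prints nothing until it reaches an end-of-comment line; at that point it cats the replacement file and turns on printing for the remainder of the original file. Because everything up to and including that line is dropped, file2 has to hold the complete new comment, copyright line and closing */ included. It assumes the license header is the only comment in the file: any later line containing */ will trigger the cat again.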
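For files that contain further comments, a variant that swaps only the first one might look like the sketch below. It is not the command above: it reads the replacement with awk's `getline` rather than calling `system("cat file2")`, and the sample contents of file1 and file2 are invented for the example.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory and sample contents, for illustration only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

cat > file1 <<'EOF'
/*
 * Copyright (c) 2012 Tyler Treat
 * old license text
 */
public class Foo {
    /* a later comment that must be left alone */
}
EOF

cat > file2 <<'EOF'
/*
 * Copyright (c) 2012 Tyler Treat
 * new license text
 */
EOF

awk -v hdr=file2 '
    # Once the header has been swapped, copy every line through unchanged.
    f { print; next }
    # First end-of-comment line: emit the replacement header instead.
    /\*\// {
        while ((getline line < hdr) > 0) print line
        close(hdr)
        f = 1
    }
' file1 > file1.new
cat file1.new
```

Because the first rule ends with `next`, the second rule can only fire once, so later comments pass through untouched.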

Ed Morton
  • At least on my system, it adds the copyright notice (file2) after every single end-of-comment line, and not just the first one in the file. – MichaelM Nov 17 '15 at 00:12
  • Yes that's it working as designed. The posted example has 1 comment in the file. If there can be more than one (as you apparently have) then it's a different problem. If that's not the desired behavior then there's various tweaks, e.g. `f` -> `f{print;next}`, depending on whether the comment you want replaced is always the first one or some other criteria. – Ed Morton Nov 17 '15 at 13:27