I would like to find and replace all forms of pow(var,2) occurring in the C++ files of my current directory with square(var).

I was looking through https://regexr.com/, but I am still not sure how I can describe var to regex. The complication is that var is a placeholder for any variable name which adheres to the following facts:

  1. It does not contain spaces
  2. It is bounded by pow( and ,2)
  3. It is composed of upper case letters [A-Z], lower case letters [a-z], and/or the underscore character _.

Is there a canonical way to do such a refactoring in Linux?

Update 1 with Minimum Working Example:

Input:

pow(alpha,2) + pow(beta,2)
(3*pow(betaR_red,2))
2/pow(gammaBlue,3))
-pow(epsilon_gamma,2)+5

Desired Output:

square(alpha) + square(beta)
(3*square(betaR_red))
2/pow(gammaBlue,3))
-square(epsilon_gamma)+5

Update 2:

Here is a link to a follow-up question for which there are more solutions to performing this particular find and replace task.

procyon
  • you could start with `sed 's/pow(var,2)/square(var)/g' `; whether you run that against all files or just files known to contain the pattern is up to you; whether you overwrite the current files or create new files is up to you – markp-fuso Sep 24 '20 at 21:43
  • Thanks @markp-fuso, but the fact is that `var` is a placeholder for any variable name that adheres to the two facts above. I have made an edit to reflect the fact that it is a placeholder. – procyon Sep 24 '20 at 21:54
  • generally speaking: `sed "s/pow(${var},2)/square(${var})/g" `; where you may get into trouble will be if the contents of your variable contain any 'special' characters that would affect the behavior of `sed` ... and if that's the case you'll need to provide a lot more detail (otherwise you'll get your question closed, again) – markp-fuso Sep 24 '20 at 22:00
  • @markp-fuso: Thanks once again. I've added more details about var. Specifically, it contains only `[A-Z]`, `[a-z]`, and/or `_`. – procyon Sep 24 '20 at 22:09
  • have you tried my second `sed` example? does it work? does it not work? update the question with your `sed` command, a sample variable (name and value), and the (incorrect) result you're getting and the desired (correct) result; also, consider reviewing [how do I ask a good question](https://stackoverflow.com/help/how-to-ask) and [how to create a minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) – markp-fuso Sep 24 '20 at 22:15
  • @markp-fuso I've added an MWE based on your advice and I also tried your second `sed` command but unfortunately it didn't work for me. Any further help would be appreciated. – procyon Sep 25 '20 at 01:36
  • `${var}` is referencing a variable ... a variable whose value you need to set; have you tried setting `var="alpha"` and then running the `sed` command? did it (not) work? it looks like you want to run the change against 4x different values (`alpha`,`beta`,`betaR_red`,`epsilon_gamma`); how are you tracking these 4x values ... 4x separate variables? in an array? going back to your original issue, how many files are you looking to process and how big are they? – markp-fuso Sep 25 '20 at 10:44
  • @markp-fuso: Thanks, I was hoping to avoid defining individual variables for all the types of text between `pow(` and `,2`. I have updated my question with a regex I found that may be helpful in defining `var`. I am looking to process about 30 *.cpp files in my directory and they are each about a couple of KB. Is the size and number of files generally an important factor when crafting a `sed` command? I thought just using a wild card would be enough. – procyon Sep 25 '20 at 15:35
  • the search patterns have to be stored somewhere; I just posted an answer that assumes the values are stored in an array; the proposed `sed/regex` is similar to what you've located but includes a means of using the array's values as a multi-pattern search (so you only have to run a single `sed` command against a given file); take the answer for a spin but keep in mind the provisos and assumptions mentioned at the beginning of the answer – markp-fuso Sep 25 '20 at 15:38
  • re: number of files and size of files ... ideally you want to minimize the number of times you have to open/search/write/close a file; with your data set (30x files @ few KB each) it's a minor issue but as the number of files and/or volume of data climbs you would start to see a noticeable difference in processing time if you approach this with a multi-pass (per file) solution in mind – markp-fuso Sep 25 '20 at 15:52
  • @markp-fuso: Thanks! I've accepted your answer as it works perfectly. I will try to use the principles contained in it to post another answer using `var="(?<=pow\().*?(?=,2\))"`. Thanks again for guiding me through this. – procyon Sep 25 '20 at 15:58

1 Answer

Provisos and assumptions:

  • OP mentions needing to process multiple files; for this answer I'm going to focus on a single file; OP can start another question if issues arise with a multi-file solution
  • OP mentions wanting to replace some strings but it's not clear (to me) if the original file is to be overwritten or a new file is to be created; for this answer I'm going to focus on generating the 'modified' output; OP can expand on this solution (below) based on final requirements
  • examples seem to imply 4x different search patterns (alpha, beta, betaR_red, epsilon_gamma); I'm going to assume there could be a variable number of patterns that need to be searched for
  • for simplicity's sake I'm going to assume the search patterns are stored in an array
  • search patterns contain no leading/trailing white space
  • search patterns are relatively simple and do not contain any special characters (eg, line feeds)

Sample input file:

$ cat input.txt
pow(alpha,2) + pow(beta,2)
(3*pow(betaR_red,2))
2/pow(gammaBlue,3))
-pow(epsilon_gamma,2)+5

Array of search patterns:

$ var=(alpha beta betaR_red epsilon_gamma 'double helix')
$ typeset -p var
declare -a var=([0]="alpha" [1]="beta" [2]="betaR_red" [3]="epsilon_gamma" [4]="double helix")

The general idea is to use sed to do a multi-pattern search of the file based on the contents of the var[] array. This means we need a way to reference the array in a manner that will be suitable for a sed multi-pattern match (ie, values need to be separated by a pipe (|)).

By assigning IFS='|' we can 'reformat' the array contents to work as a multi-pattern search string for sed:

$ echo "${var[*]}"
alpha beta betaR_red epsilon_gamma double helix
$ IFS='|' varX="${var[*]}" ; echo "${varX}"
alpha|beta|betaR_red|epsilon_gamma|double helix
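
Because the IFS assignment above is not attached to a command, it stays in effect for the rest of the shell session. A minimal sketch of one way to get the same joined string without touching the caller's IFS; `join_by_pipe` is a hypothetical helper name introduced here, not part of the original answer:

```shell
# join_by_pipe is a hypothetical helper, not part of the original answer.
# The ( ... ) function body runs in a subshell, so the IFS change
# never reaches the calling shell.
join_by_pipe() (
    IFS='|'
    # "$*" joins the positional parameters with the first character of IFS
    printf '%s\n' "$*"
)

varX=$(join_by_pipe alpha beta betaR_red epsilon_gamma 'double helix')
echo "${varX}"
```

With the array from this answer, the call in bash would be `varX=$(join_by_pipe "${var[@]}")`, and the result can then be interpolated into the sed pattern as `(${varX})`.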

Which brings us to the sed command:

$ IFS='|' sed -E "s/pow\((${var[*]}),2\)/square(\1)/g" input.txt

Where:

  • sed -E - run with extended regex support
  • pow\( / ,2\) - search for our pow(..,2) string, escaping the parens so they are not evaluated as delimiters of a regex group
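  • IFS='|' / (${var[*]}) - expand array var using '|' as value delimiter; note that bash expands ${var[*]} before the IFS='|' prefix is applied to sed, so this relies on IFS already being '|' in the current shell (as set in the previous step); by wrapping in parens this becomes our first (and only) search group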
  • square( / ) - replacement string for pow( / ,2) pattern
  • \1 - copy contents of our search group, eg, if we matched on pow(beta,2) then \1 == beta

If we execute the above as set -xv ; IFS='|' sed ...; set +xv we will generate the following 'debug' output showing how the sed command is expanded with the values of the var array:

++ IFS='|'
++ sed -E 's/pow\((alpha|beta|betaR_red|epsilon_gamma|double helix),2\)/square(\1)/g' input.txt

The actual output of the above sed command:

square(alpha) + square(beta)          # 2x changes
(3*square(betaR_red))                 # 1x change
2/pow(gammaBlue,3))                   # no changes
-square(epsilon_gamma)+5              # 1x change
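
The array approach needs every variable name listed in advance. Since the question restricts var to letters and underscores, a character class can match any such name directly. The following is a sketch under that assumption, run against two throwaway files in a temporary directory (the file names `a.cpp` and `b.cpp` are made up for illustration). It relies on the `-i` in-place option, which is an extension available in GNU and BSD sed rather than part of POSIX:

```shell
# Scratch directory with two sample files, so no real sources are touched.
tmpdir=$(mktemp -d)
printf '%s\n' 'pow(alpha,2) + pow(beta,2)' '(3*pow(betaR_red,2))' > "$tmpdir/a.cpp"
printf '%s\n' '2/pow(gammaBlue,3))' '-pow(epsilon_gamma,2)+5' > "$tmpdir/b.cpp"

# [A-Za-z_]+ matches any name made of letters and underscores.
# \( and \) are literal parens under -E; the unescaped ( ) form capture group \1.
# -i.bak edits each file in place and keeps the original as <name>.bak.
sed -E -i.bak 's/pow\(([A-Za-z_]+),2\)/square(\1)/g' "$tmpdir"/*.cpp

result=$(cat "$tmpdir/a.cpp" "$tmpdir/b.cpp")
printf '%s\n' "$result"
rm -r "$tmpdir"
```

In a real source directory the same sed line would be pointed at `*.cpp`. Two caveats: the pattern would also rewrite a call such as `mypow(x,2)`, and it does not match spacing variants such as `pow(x, 2)`, so it is worth reviewing the `.bak` diffs before deleting the backups.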
markp-fuso