
I am trying to extract the substring between &DEST= and the next & or the end of the line. For example:

  1. MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546

    In this I need to extract "SFO"

  2. MYREQUESTISTO8764GETTHIS&DEST=SANFRANSISCO&ORIG=6546

    In this I need to extract "SANFRANSISCO"

  3. MYREQUESTISTO8764GETTHISWITH&DEST=SANJOSE

    In this I need to extract "SANJOSE"

I am reading a file line by line, and I need to update the text after &DEST= and put it back in the file. The modification of the text is to mask the dest value with X character.

So, SFO should be replaced with XXX. SANJOSE should be replaced with XXXXXXX.

Output:

MYREQUESTISTO8764GETTHIS&DEST=XXX&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=XXXXXXXXXXXX&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=XXXXXXX

Please let me know how to achieve this in script (Preferably shell or bash script).

Thanks.

Puneet Jain
  • Are you trying to [parse a query string](http://stackoverflow.com/questions/3919755/how-to-parse-query-string-from-a-bash-cgi-script)? – that other guy Aug 02 '16 at 17:48
  • Hi, Thanks for replying. I am actually reading a file line by line and trying to mask the text with X character. Please read my updated question. I just edited it. Thanks – Puneet Jain Aug 02 '16 at 20:32

4 Answers

2
$ cat file
MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=PORTORICA
MYREQUESTISTO8764GETTHIS&DEST=SANFRANSISCO&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=SANJOSE
$ sed -E 's/^.*&DEST=([^&]*)[&]*.*$/\1/' file
SFO
PORTORICA
SANFRANSISCO
SANJOSE

should do it

sjsam
  • Thank you!!. I will try it. Also, I have edited my question a bit. Could you please read my updated questions? Thanks!! – Puneet Jain Aug 02 '16 at 20:29
1

Replacing airports with an equal number of Xs

Let's consider this test file:

$ cat file
MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=SANFRANSISCO&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=SANJOSE

To replace the string after &DEST= with the same number of X characters, using GNU sed:

$ sed -E ':a; s/(&DEST=X*)[^X&]/\1X/; ta' file
MYREQUESTISTO8764GETTHIS&DEST=XXX&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=XXXXXXXXXXXX&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=XXXXXXX

To replace the file in-place:

sed -i -E ':a; s/(&DEST=X*)[^X&]/\1X/; ta' file

The above was tested with GNU sed. For BSD (OSX) sed, try:

sed -Ee :a -e 's/(&DEST=X*)[^X&]/\1X/' -e ta file

Or, to change in-place with BSD(OSX) sed, try:

sed -i '' -Ee :a -e 's/(&DEST=X*)[^X&]/\1X/' -e ta file

If there is some reason why it is important to use the shell to read the file line-by-line:

while IFS= read -r line
do
   echo "$line" | sed -Ee :a -e 's/(&DEST=X*)[^X&]/\1X/' -e ta
done <file

How it works

Let's consider this code:

search_str="&DEST="
newfile=chart.txt
sed -E ':a; s/('"$search_str"'X*)[^X&]/\1X/; ta' "$newfile"
  • -E

    This tells sed to use Extended Regular Expressions (ERE). This has the advantage of requiring fewer backslashes to escape things.

  • :a

    This creates a label a.

  • s/('"$search_str"'X*)[^X&]/\1X/

    This looks for $search_str followed by any number of X characters, followed by one character that is neither X nor &. Because of the parentheses, everything except that last character is saved into group 1. The match is then replaced by group 1 (written \1) followed by an X, so each pass masks exactly one more character.

  • ta

    In sed, t is a test command. If the substitution was made (meaning that some character needed to be replaced by X), then the test evaluates to true and, in that case, ta tells sed to jump to label a.

    This test-and-jump causes the substitution to be repeated as many times as necessary.
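To make the loop easier to follow, here is a small sketch that runs the substitution one pass at a time and prints the line after each pass. The helper name mask_once and the sample line are made up for illustration; the sed expression is the one from the answer with the :a label and ta jump removed.

```shell
# One pass of the substitution, without the :a / ta loop.
# mask_once is a hypothetical helper name used only for this trace.
mask_once() {
  printf '%s\n' "$1" | sed -E 's/(&DEST=X*)[^X&]/\1X/'
}

line='REQ&DEST=SFO&ORIG=6546'
prev=
# Repeat until a pass changes nothing, which is the point where sed's
# "ta" would stop jumping back to label a.
while [ "$line" != "$prev" ]; do
  printf '%s\n' "$line"
  prev=$line
  line=$(mask_once "$line")
done
```

Each printed line has one more character of the value masked (SFO, XFO, XXO, XXX). The loop stops once the character after the run of X is an &, because nothing matches [^X&] any more.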

Replacing multiple tags with one sed command

$ name='DEST|ORIG'; sed -E ':a; s/(&('"$name"')=X*)[^X&]/\1X/; ta' file
MYREQUESTISTO8764GETTHIS&DEST=XXX&ORIG=XXXX
MYREQUESTISTO8764GETTHIS&DEST=XXXXXXXXXXXX&ORIG=XXXX
MYREQUESTISTO8764GETTHISWITH&DEST=XXXXXXX
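The comments below also ask about tags with a bracketed index such as &DEST[7]=. As a rough sketch, the alternation above can be combined with the optional-index pattern suggested there. The function name mask_tags is hypothetical, and the tag list is assumed to contain only plain names separated by |. The separate -e arguments follow the BSD-friendly form shown earlier.

```shell
# mask_tags NAMES: read lines on stdin and mask the value of every tag in
# NAMES (an alternation such as 'DEST|ORIG'), with or without a [digits]
# index before the equal sign. Hypothetical helper, not from the answer.
mask_tags() {
  sed -E -e ':a' \
         -e 's/(&('"$1"')(\[[[:digit:]]+\])?=X*)[^X&]/\1X/' \
         -e 'ta'
}

printf '%s\n' 'REQ&DEST[3]=SFO&ORIG=6546' | mask_tags 'DEST|ORIG'
# prints: REQ&DEST[3]=XXX&ORIG=XXXX
```

Since all names go into a single expression, sed reads the input once regardless of how many tags are listed.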

Answer for original question

Using shell

$ s='MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546'
$ s=${s#*&DEST=}
$ echo "${s%%&*}"
SFO

How it works:

  • ${s#*&DEST=} is prefix removal. This removes all text up to and including the first occurrence of &DEST=.

  • ${s%%&*} is suffix removal. It removes all text from the first & to the end of the string.
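The same prefix and suffix removals can also be used to rebuild the whole line with the value masked, which is what the updated question asks for. The following is only a sketch: the function name mask_dest and its variable names are made up here, and it handles only the first &DEST= on a line.

```shell
# Mask the value after the first &DEST= on one line, using only
# parameter expansion (no sed or awk). The ( ... ) body runs in a
# subshell so the working variables do not leak into the caller.
mask_dest() (
  line=$1
  key='&DEST='
  case $line in
    *"$key"*) ;;                          # key present: continue below
    *) printf '%s\n' "$line"; return ;;   # key absent: print line unchanged
  esac
  head=${line%%"$key"*}    # text before the first &DEST=
  rest=${line#*"$key"}     # text after the first &DEST=
  value=${rest%%&*}        # the value, up to the next & or end of line
  tail=${rest#"$value"}    # remainder, starting at & (may be empty)
  mask=
  # Build a string of X with the same length as the value.
  while [ "${#mask}" -lt "${#value}" ]; do mask=${mask}X; done
  printf '%s%s%s%s\n' "$head" "$key" "$mask" "$tail"
)

mask_dest 'MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546'
# prints: MYREQUESTISTO8764GETTHIS&DEST=XXX&ORIG=6546
```

Inside a while IFS= read -r line loop this could be called as mask_dest "$line", with the output redirected to a new file. For large files the sed approach above is likely to be faster.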

Using awk

$ echo 'MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546' | awk -F'[=\n]' '$1=="DEST"{print $2}' RS='&'
SFO

How it works:

  • -F'[=\n]'

    This tells awk to treat either an equal sign or a newline as the field separator, so a trailing newline on the last record does not end up in the value.

  • $1=="DEST"{print $2}

    If the first field is DEST, then print the second field.

  • RS='&'

    This sets the record separator to &.
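The RS='&' approach above splits the input at every &. As an alternative sketch, the value can be pulled out line by line with awk's match() function, which may be easier to adapt when reading a whole file. The function name extract_dest is made up for this example, and the offset 6 is the length of the literal &DEST=.

```shell
# Print the DEST value of every line that has one.
# match() sets RSTART and RLENGTH to the position and length of the match;
# skipping the 6 characters of "&DEST=" leaves just the value.
extract_dest() {
  awk 'match($0, /&DEST=[^&]*/) { print substr($0, RSTART + 6, RLENGTH - 6) }' "$@"
}

printf '%s\n' 'A&DEST=SFO&ORIG=6546' 'B&DEST=SANJOSE' | extract_dest
# prints:
# SFO
# SANJOSE
```

With a file name as argument (extract_dest file) it reads the file instead of standard input. Lines without &DEST= produce no output.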

John1024
  • You might want `${s##*&DEST=}` ? for an i/p say `O8764GETTHIS&DEST&=DEST=SFO&ORIG=6546`. But perhaps that is not valid i/p is it? – Мона_Сах Aug 02 '16 at 18:13
  • @mona_sax OK. Could you clarify a bit? In your example, the string `&DEST=` never occurs. If that was a typo, do you anticipate cases where `&DEST=` is in the string _twice_? Why? If it is in the string twice, how would we know that we wanted the value of the last occurrence (as per your code) and not the first? – John1024 Aug 02 '16 at 18:19
  • Using shell, I like the solution. Because remember that I am reading a file line by line and if a has say : 'MYREQUESTISTO8764GETTHIS&DEST=SFO&ORIG=6546' than I need to extract SFO and replace it by 3 X characters like: 'MYREQUESTISTO8764GETTHIS&DEST=XXX&ORIG=6546' and replace the entire line with the modified line. ... If it is SANJOSE, then I need to replace it with 7 'X' character... .... So, I need something which can give me a modified line. – Puneet Jain Aug 02 '16 at 20:18
  • Thank you!!. I will try it. Also, I have edited my question a bit. Could you please read my updated questions? Thanks!! – Puneet Jain Aug 02 '16 at 20:30
  • @PuneetJain OK. See updated answer for a method to replace the airports with Xs. – John1024 Aug 02 '16 at 21:22
  • @John1024, Hi John, in order for your answer "sed -i -E ':a; s/(&DEST=X*)[^X&]/\1X/; ta' file" to work, what command should i have to read the file line by line. I mean, I can find out from other forums on how to read a file (say chart.txt) line by line, but I am asking your command which can be followed by above mentioned sed command. Thanks – Puneet Jain Aug 02 '16 at 21:34
  • @PuneetJain `sed` is good at reading line-by-line from a file. In case there is some reason why you need to use shell code to do that instead, I just added sample code for that. – John1024 Aug 02 '16 at 21:42
  • @John1024 When I am putting &DEST= into a variable, and using it, its not working. search_str="&DEST="; newfile=chart.txt; sed -i -E ':a; s/("$search_str"X*)[^X&]/\1X/; ta' "$newfile" Am i missing something else? Thanks for your help John !! – Puneet Jain Aug 02 '16 at 22:30
  • Try: `search_str="&DEST="; newfile=chart.txt; sed -E ':a; s/('"$search_str"'X*)[^X&]/\1X/; ta' "$newfile"` – John1024 Aug 02 '16 at 22:51
  • @John1024 Thanks John !! It worked. If you have time, would you mind telling me different part of the command, what they do etc, in case if I have some requirement change, then I can take your command as base and modify it accordingly. Thanks. Appreciate ur help. – Puneet Jain Aug 02 '16 at 23:55
  • @PuneetJain OK. I added an explanation: see the newly added "How it works" section. – John1024 Aug 03 '16 at 00:14
  • Thanks @John1024 . Sorry if I m bugging you.. but other than of a plain "&DEST=", i have to support "&DEST[7]="... basically square brackets have been added with a arbitray number inside it... and i have to mask the text after the "=" till the next "&". \n MYREQUESTISTO8764GETTHIS&DEST[3]=XXX&ORIG=6546 MYREQUESTISTO8764GETTHIS&DEST[12]=XXXXXXXXXXXX&ORIG=6546 MYREQUESTISTO8764GETTHISWITH&DEST[7]=XXXXXXX Something you can help with if its easy for you!! – Puneet Jain Aug 03 '16 at 03:46
  • In that case, use `search_str="&DEST(\[[[:digit:]]+\])?="` – John1024 Aug 03 '16 at 05:59
  • Thanks @John1024. Unfortunately it didn't worked. Both the cases didn't worked......This is what I did..... search_str=\&$name"(\[[[:digit:]]+\])?=" where $name can have values such as DEST or ORIG or HALT etc... – Puneet Jain Aug 03 '16 at 19:33
  • @John1024 One side effect I am seeing is that if the input is :"&DEST= SFO &" Than the output is "&DEST=XXXO &" instead of "&DEST= XXX &".... looks like something goof up in space counting. – Puneet Jain Aug 04 '16 at 08:16
  • @PuneetJain Replace `[^X&]` with `[^X &]` like this: `sed -E ':a; s/('"$search_str"'X*)[^X &]/\1X/; ta' "$newfile"` – John1024 Aug 04 '16 at 18:22
  • Thanks @John1024 it worked. Now one final last thing remaining is this-> If you could as well answer it, i would really really appreciate it.. I have search 100s of forums but nothing came close to this. http://stackoverflow.com/questions/38757920/parsing-line-and-changing-some-text-in-place – Puneet Jain Aug 05 '16 at 05:39
  • @John1024 Hi John... Looks like for the question in the other post, http://stackoverflow.com/questions/38757920/parsing-line-and-changing-some-text-in-place, somehow I am not able to add you to my reply.. not sure why.. so I am replying here, so that you can please answer my last question in that other post. Thanks. – Puneet Jain Aug 08 '16 at 20:59
  • @John1024 I know I am asking too much, but if you could correct my querry i wrote, i would really appreciate it.. its not working in couple of cases.. see this : http://stackoverflow.com/questions/38911200/change-string-in-file-between-two-strings-with-character-x – Puneet Jain Aug 19 '16 at 03:59
  • @John1024 Coming back to this original questions answer, you said, I can use this : search_str=\&$name"([[[:digit:]]+])?=" where $name can have different values from an array : like in a while loop, name can have DEST, and then it can have ORIG etc. Its working. But its not efficient, because if there are 10 strings like DEST/NAME/TAXI etc, then the sed command is being called multiple times. Can I write a sed command in which search_str=\&$names"([[[:digit:]]+])?=" and while $name will be (DEST|ORIG|TAXI|PLANE) etc? I tried it, but its not working. Please help. – Puneet Jain Aug 23 '16 at 22:02
  • @PuneetJain See the section entitled "Replacing multiple tags with one sed command" in the updated answer. – John1024 Aug 23 '16 at 22:20
1

With GNU bash:

re='(.*&DEST=)([^&]*)(.*)'   # 1: up to &DEST=, 2: the value, 3: the rest
while IFS= read -r line; do
  [[ $line =~ $re ]] && echo "${BASH_REMATCH[1]}fooooo${BASH_REMATCH[3]}"
done < file

Output:

MYREQUESTISTO8764GETTHIS&DEST=fooooo&ORIG=6546
MYREQUESTISTO8764GETTHIS&DEST=fooooo&ORIG=6546
MYREQUESTISTO8764GETTHISWITH&DEST=fooooo
Cyrus
  • Thank you!!. I will try it. Also, I have edited my question a bit. Could you please read my updated questions? Thanks!! – Puneet Jain Aug 02 '16 at 20:29
-1

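Replace the characters between &DEST= and the next & (or end of line) with X's:

awk -F'&DEST=' '
NF < 2 { print; next }                     # no &DEST= on this line: pass through
{
   printf("%s&DEST=", $1)
   xlen = index($2, "&")                   # position of the next & after the value
   if (xlen == 0) xlen = length($2) + 1    # no &: the value runs to end of line
   for (i = 1; i < xlen; i++) printf("X")  # one X per character of the value
   printf("%s\n", substr($2, xlen))        # rest of the line, from & onward
}' file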
Walter A