Linux shell script to read from a file with separator and then manipulate some fields and redirect with updated fields in new file

Question

Need to read line by line a file example below and then get only and put it in separate file file 1

1~abc~select col1,col2,col3,col4 from <tablename> where <condition> group by~xyz
2~abc~select col1,col2,col3      from <tablename> where <condition> group by~xyq
3~abc~select col1,col2           from <tablename> where <condition> group by~xyg

new file expecting

1~abc~<condition>~xyz
2~abc~<condition>~xyq
3~abc~<condition>~xyg

is there any simpler way to get this done using shell script.

Pankaj Saini · Answer 1 · 2022-07-18T10:10:29.370

0

This can be achieved using sed utility in unix. A single expression is sufficient actually sed -e 's/select .* where //;s/ group by.*~/~/'

See it in action here:

pankaj@pankaj-mac TP % cat inputfile.txt 
1~abc~select col1,col2,col3,col4 from <tablename> where <condition> group by~xyz
2~abc~select col1,col2,col3      from <tablename> where <condition> group by~xyq
3~abc~select col1,col2           from <tablename> where <condition> group by~xyg

pankaj@pankaj-mac TP % sed -e 's/select .* where //;s/ group by//' inputfile.txt > newfile.txt
pankaj@pankaj-mac TP % cat newfile.txt 
1~abc~<condition>~xyz
2~abc~<condition>~xyq
3~abc~<condition>~xyg

pankaj@pankaj-mac TP %

edited Jul 18 '22 at 10:10

answered Jul 16 '22 at 06:11

Pankaj Saini

1,164
1
5
4

1

You might want to use a single regex with a back reference. Perhaps prefer `[^~]` to avoid letting the match straddle field boundaries. `sed 's/select [^~]* where \([^~]*\) group by[^~]*/\1/'` – tripleee Jul 16 '22 at 06:37
I tried Pankaj Saini solution and it worked thanks a lot for your time and effort – OrangeGallery Jul 17 '22 at 18:21
One additional stuff , how about if group by , order by have values as shown below. forgot to add in previous one 1~abc~select col1,col2,col3,col4 from where group by col1,col2,col3,col4 order by col1,col2,col3,col4~xyz 2~abc~select col1,col2,col3 from where group by col1,col2,col3 order by col1,col2,col3~xyq 3~abc~select col1,col2 from where group by col1,col2 order by col1,col2~xyg Need output like 1~abc~~xyz 2~abc~~xyq 3~abc~~xyg – OrangeGallery Jul 18 '22 at 08:43
Edit my response based on this additional input. Also the solution proposed by tripleee will work.. – Pankaj Saini Jul 18 '22 at 10:11

ufopilot · Answer 2 · 2022-07-16T14:21:56.043

Using GNU Awk

$ awk -F'~| where | group ' '$0=$1"~"$2"~"$4"~"$NF' file 
$ awk -F'~| where | group ' -v OFS='~' '{print $1,$2,$4,$NF}' file
$ awk -F '~' -v OFS='~' 'gsub(/.* where | group by.*/,"",$3)' file
$ awk '$0=gensub(/(.*)select .* where (.*) group by(.*)/,"\\1\\2\\3",1)' file
$ awk 'match($0,/(.*)select.*where (.*) group by(.*)/,a){$0=a[1]a[2]a[3]}1' file 

1~abc~<condition>~xyz
2~abc~<condition>~xyq
3~abc~<condition>~xyg

tripleee · Answer 3 · 2022-07-16T06:39:27.540

In the slightly more general case where the regexes could also match within some of the other fields, Awk might be a better choice than sed.

awk -F '~' '{
  sub(/.* where /, "", $3);
  sub(/ group by.*/, "", $3)
 }1' file >newfile

The final 1 is the common Awk idiom for "print the current input unconditionally".

This could also be accomplished with a while read -r loop in the shell, but you generally want to avoid that; see Bash while read loop extremely slow compared to cat, why?

Linux shell script to read from a file with separator and then manipulate some fields and redirect with updated fields in new file

3 Answers3