0

I have an input data file (data.csv) with the below format and I need to update it to get my result

ABC,SOME CONTENT,XYZ,SOME CONTENT,,#AB DEF,,,,,,,,,,,0.1,,
ABC,SOME CONTENT,XY1,SOME CONTENT,,#AB DEF,,,,,,,,,,,0.1,,
ABC,SOME CONTENT,XY2,SOME CONTENT,,#AB DEF,,,,,,,,,,,0.1,,
ABC,SOME CONTENT,XY3,SOME CONTENT,,#AB DEF,,,,,,,,,,,0.1,,

My condition is I have a file (input.txt) with the below content

XYZ
XY3

I need to check the records in data.csv for each value in input.txt and update the value in the column from #AB DEF to #in some country and update the value from 0.1 to 0 only for the corresponding records and place it back in the same file.

Result:

ABC,SOME CONTENT,**XYZ**,SOME CONTENT,,**#in some country**,,,,,,,,,,,**0**,,
ABC,SOME CONTENT,**XY1**,SOME CONTENT,,**#AB DEF**,,,,,,,,,,,**0.1**,,
ABC,SOME CONTENT,**XY2**,SOME CONTENT,,**#AB DEF**,,,,,,,,,,,**0.1**,,
ABC,SOME CONTENT,**XY3**,SOME CONTENT,,**#in some country**,,,,,,,,,,,**0**,,

I tried with the below command (I am not sure how to check the condition within the sed command), it updates all the records, I know it is because of g at the end, but not sure how to limit it only for the records which match my condition of XYZ

sed -i "s|#AB DEF,,,,,,,,,,,0.1,,|#in some country,,,,,,,,,,,0,,|g" data.csv

I have to have # and spaces within the data, I tried to put the data into variables and tried to replace it, but sed is identifying there is a special character and failing.

Dileep
  • 126
  • 1
  • 4
  • 14
  • 1
    Are the asterisks (`*`) in your input and output meant to highlight the words? Can I assume that the data to match and to be replaced are in fixed columns/fields? Please clarify this in your question. Is it necessary to check that column 6 has the value `#AB DEF` or can it be replaced with `#in some country` unconditionally if column 3 matches one of the values in `input.txt`? Same question for the `0.1` value. – Bodo Sep 01 '20 at 11:22
  • @Bodo, yes the * are the contents, I wanted to highlight. I didn't know why Stackoverflow didn't like my approach of highlighting. Yes, it has to be matched for the data present and replace it – Dileep Sep 02 '20 at 10:29
  • **Please [edit] your question** and add all requested information or clarification there instead of answering in comments. You can't use `**` for highlighting in code blocks. Additional question: Your `sed` script replaces the string `#AB DEF,,,,,,,,,,,0.1,,` with `#in some country,,,,,,,,,,,0,,`, so it requires the exact combination of `#AB DEF` followed by 10 empty values, followed by `0.1`, followed by 2 empty values. Is this your requirement? Or should `#AB DEF` in column 6 be replaced independently from the values in columns 7 and following? – Bodo Sep 02 '20 at 10:41

3 Answers3

4
$ awk 'BEGIN{FS=OFS=","} NR==FNR{a[$1]; next} $3 in a{$6="#in some country"; $17=0} 1' input.txt data.csv
ABC,SOME CONTENT,XYZ,SOME CONTENT,,#in some country,,,,,,,,,,,0,,
ABC,SOME CONTENT,XY1,SOME CONTENT,,#AB DEF,,,,,,,,,,,0.1,,
ABC,SOME CONTENT,XY2,SOME CONTENT,,#AB DEF,,,,,,,,,,,0.1,,
ABC,SOME CONTENT,XY3,SOME CONTENT,,#in some country,,,,,,,,,,,0,,

The above assumes the *s in your input/output were added for emphasis and don't really exist in your data.

Ed Morton
  • 188,023
  • 17
  • 78
  • 185
  • I tried to replicate what you have given, Thanks, for the solution without using loop. I need some more help, it prints the output on the console along with modifying the file, is it possible not to have it displayed on the console or just the records only which have been modified to be displayed on the console – Dileep Sep 02 '20 at 10:25
  • 1
    @Dileep Please don't write additional requirements in comments. Comments are intended to write suggestions or to ask for clarification. The `awk` command doesn't modify the input files, it only prints to stdout. `awk` does not support in-place editing like `sed`'s option `-i`. If you really want to modify the original file `data.csv`, redirect the output to a temporary file and on success rename the output file to `data.csv`, e.g. `awk '...' input.txt data.csv > output.csv && mv output.csv data.csv` Modifying the input data has the drawback that repeating the command may have unwanted effects. – Bodo Sep 02 '20 at 10:50
  • @Bodo GNU awk supports "inplace" editing with `-i inplace` just like GNU sed supports it with `-i`. – Ed Morton Sep 02 '20 at 15:46
  • 1
    @EdMorton Thanks for the clarification. Unfortunately the OP didn't specify the exact OS and the available tools. – Bodo Sep 02 '20 at 15:49
  • @Bodo right but they ARE using GNU sed, you can tell by `sed -i '...'`, so we know they have GNU tools available. You're right about the general case of using a tmp file though, that's what all those "inplace" editing tools (except `ed`) do internally anyway. – Ed Morton Sep 02 '20 at 15:49
0

This might work for you (GNU sed):

sed 's#.*#/,[^,]*,&,/ba#' inputFile | sed -i -Ef - -e 'b;:a;s/#AB DEF[^,]*/#in some country/;s/0\.1/0/' file

Build a set of commands to filter file on the contents of inputFile and pipe them through to a second invocation of sed that manipulates the contents of file.

potong
  • 55,640
  • 6
  • 51
  • 83
-1

I got the answer after I posted the question, so posting the answer back

 while read -r line; do
      sed -i "/$line/s|#AB DEF,,,,,,,,,,,0.1,,|#in some country,,,,,,,,,,,0,,|g" data.csv
 done < input.txt
Dileep
  • 126
  • 1
  • 4
  • 14
  • 2
    See [why-is-using-a-shell-loop-to-process-text-considered-bad-practice](https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice) for **some** of the issues and [is-it-possible-to-escape-regex-metacharacters-reliably-with-sed](https://stackoverflow.com/questions/29613304/is-it-possible-to-escape-regex-metacharacters-reliably-with-sed) for some others. – Ed Morton Sep 01 '20 at 11:24