Input CSV: new_param.csv

It has values like:

ID
Identity
as-uid
cp_cus_id
evs
k_n

master.csv has values like:

A, xyz, id, abc
n, xyz, as-uid, abc, B, xyz, ne, abc
q, xyz, id evs, abc
3, xyz, k_n, abc, C, xyz, ad, abc
1, xyz, zd, abc
z, xyz, ID, abc

Required output: an updated new_param.csv with true or false in the 2nd column, depending on whether the keyword is found in master.csv

ID,true
Identity,false
as-uid,true
cp_cus_id,false
evs,true
k_n,true

I tried the code below, but it produces no output:

#!/bin/bash

declare -a keywords=(`cat new_param.csv`)
 
length=${#keywords[@]}

for (( j=0; j<length; j++ ));
do
 a= LC_ALL=C awk -v kw="${keywords[$j]}" -F, '{for (i=1;i<=NF;i++) if ($i ~ kw) {print i}}' master.csv
b=0
if [ $a -gt $b ]
then
  echo true $2 >> new_param.csv
else
  echo false $2 >> new_param.csv
fi
done

Could someone please help?

I tried the code above, but it does not work.

I get errors like:

test.sh: line 29: [: -gt: unary operator expected
test.sh: line 33: -f2: command not found

  • please help anyone :pray: – ASHOK DAUKIYA Jan 31 '23 at 16:06
  • cut-n-paste your code into [shellcheck.net](https://www.shellcheck.net/) and make the recommended code changes; in this particular case variable `a` is empty (`a= `) and because of missing double quotes `[ $a -gt $b ]` becomes `[ -gt $b ]` hence the error message – markp-fuso Jan 31 '23 at 16:07
  • @markp-fuso my bad it's sample data, updated !! – ASHOK DAUKIYA Jan 31 '23 at 16:13
  • @markp-fuso if I add double quotes used like `if [ "$a" -gt "$b" ]` then get error `test.sh: line 29: [: : integer expression expected` – ASHOK DAUKIYA Jan 31 '23 at 16:15
  • line numbers from the error messages don't match the code you've posted, and we're not shown the script invocation so we don't know what `$2` contains ... so it's hard to tell what the 2nd error message (re: `-f2`) is referring to; consider updating the question with the code that generated the error messages, or update the error message(s) based on running the code you've supplied here; also provide the value of `$2` – markp-fuso Jan 31 '23 at 16:17
  • yes, adding the double quotes generates a new error message ... which is (still) tied to the fact that `a` is empty; assuming `a` is supposed to contain the output from the `awk` call ... try `a=$(LC_ALL=C awk ...)` – markp-fuso Jan 31 '23 at 16:18
  • @markp-fuso tried `a=$(LC_ALL=C awk -v kw="${keywords[$j]}" -F, '{for (i=1;i<=NF;i++) if ($i ~ kw) {print i}}' master.csv) b=0 if [ "$a" -gt "$b" ]` error is test.sh: line 29: [: : integer expression expected – ASHOK DAUKIYA Jan 31 '23 at 16:30
  • and ... what's in `a`? what is the output from `typeset -p a`? does `a` contain what you're expecting it to contain? – markp-fuso Jan 31 '23 at 16:32
  • @markp-fuso I want to use the `a` command to check whether the keyword exists in the csv; for now I am using it to find the column numbers of the csv fields that contain the given keyword, and it gives me output like `36 36 46 36` – ASHOK DAUKIYA Jan 31 '23 at 16:38
  • @markp-fuso if the keyword does not exist in the csv, then it gives me blank output – ASHOK DAUKIYA Jan 31 '23 at 16:40
  • `-gt` is a numeric comparison operator; *you* need to ensure both `a` and `b` contain numbers (actually, integers since `bash` only works with integers); if `a` and/or `b` are empty or non-integer then you will get an error; at this point *you* need to review your design (eg, do you want to compare integers or strings?) and then update your code to match your design requirements – markp-fuso Jan 31 '23 at 16:43
  • @markp-fuso can you suggest for any other solution for - check keyword exist in csv or not if exist then update or add 2nd column in input csv with any type of identifier like 0 or 1 and true or false – ASHOK DAUKIYA Jan 31 '23 at 16:52
  • should `evs` (from `master.csv`) match on the string `XevsY`? or are you looking for exact word matches? – markp-fuso Jan 31 '23 at 17:41
  • @markp-fuso exact word contains in column – ASHOK DAUKIYA Jan 31 '23 at 17:49
  • See [Are shell scripts sensitive to encoding and line endings?](https://stackoverflow.com/q/39527571/4154375) and [How to convert Windows end of line in Unix end of line (CR/LF to LF)](https://stackoverflow.com/q/3891076/4154375). – pjh Jan 31 '23 at 21:45
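
The comments above come down to two fixes: capture awk's output with `$(...)` (a bare `a= cmd` leaves `a` empty), and test whether that output is non-empty instead of comparing it with `-gt`, since it can be blank or hold several numbers. A minimal sketch of that idea follows; the function names `find_columns` and `keyword_status` are made up for illustration, and `index()` is swapped in for the original `$i ~ kw` so the keyword is treated as a literal substring rather than a regex.

```shell
#!/bin/bash

# Hypothetical helper: print the column number of every field in CSV file $2
# that contains keyword $1 as a literal substring.
find_columns() {
    LC_ALL=C awk -v kw="$1" -F, \
        '{ for (i = 1; i <= NF; i++) if (index($i, kw)) print i }' "$2"
}

# Hypothetical helper: print "true" if keyword $1 occurs anywhere in file $2.
keyword_status() {
    local a
    a=$(find_columns "$1" "$2")   # $(...) captures the awk output
    if [ -n "$a" ]; then          # string test: safe when a is empty or multi-valued
        echo true
    else
        echo false
    fi
}
```

With the sample master.csv, `keyword_status ID master.csv` would be expected to print true and `keyword_status Identity master.csv` false. Note that this is still a substring check, not an exact word match.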

4 Answers

awk -v RS=', |\n' 'NR == FNR { a[$0] = 1; next }
        { gsub(/,.*/, ""); b = "" b $0 (a[$0] ? ",true" : ",false") "\n" }
        END { if (FILENAME == "new_param.csv") printf "%s", b > FILENAME }' master.csv new_param.csv
konsolebox

Try this Shellcheck-clean pure Bash code:

#! /bin/bash -p

outputs=()

while read -r kw; do
    if grep -q -E "(^|[[:space:],])$kw([[:space:],]|\$)" master.csv; then
        outputs+=( "$kw,true" )
    else
        outputs+=( "$kw,false" )
    fi
done <new_param.csv

printf '%s\n' "${outputs[@]}" >new_param.csv
  • You may need to tweak the regular expression used with grep -E depending on what exactly you want to count as a match.
pjh
  • It returns false for all keywords: `ID FALSE Identity FALSE as-uid FALSE cp_cus_id FALSE` – ASHOK DAUKIYA Jan 31 '23 at 18:22
  • I tested the code with the files that you specified before I posted it. You are either not using the code that I provided or you are running it with different files. There is absolutely no way that the code could generate output with `FALSE` instead of `false` and no commas. – pjh Jan 31 '23 at 18:26
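
As one possible way of tweaking what counts as a match (per the note in this answer), here is a sketch that avoids putting the keyword into a regular expression at all. It splits each line of the master file on commas, blanks, tabs and carriage returns, then compares every resulting word to the keyword as a plain string. The helper name `has_word` is hypothetical, and this is an awk-based alternative, not the answer's grep approach.

```shell
#!/bin/bash

# Hypothetical helper: succeed (exit status 0) if keyword $1 appears in file $2
# as a whole word, where words are separated by commas, blanks, tabs or CRs.
has_word() {
    awk -v kw="$1" -F '[ \t\r,]+' '
        { for (i = 1; i <= NF; i++)
              # appending "" forces a string comparison, never a numeric one
              if (($i "") == (kw "")) { found = 1; exit } }
        END { exit !found }
    ' "$2"
}
```

It could replace the `grep -q -E ...` test in the loop, for example `if has_word "$kw" master.csv; then ...`. Because the carriage return is treated as a separator, Windows line endings in the master file do not hide the last word of a line; a trailing `\r` on the keyword itself would still need to be removed.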

Using grep to find exact word matches:

$ grep -owf new_param.csv master.csv | sort -u
ID
as-uid
evs
k_n

Then feed this to awk to match against new_param.csv entries:

awk '
BEGIN   { OFS="," }
FNR==NR { a[$1]; next }
        { print $1, ($1 in a) ? "true" : "false" }
' <(grep -owf new_param.csv master.csv | sort -u) new_param.csv

This generates:

ID,true
Identity,false
as-uid,true
cp_cus_id,false
evs,true
k_n,true

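Once the results are confirmed as correct, OP can save them back to new_param.csv. Redirecting straight to > new_param.csv would truncate the file before awk reads it, so write to a temporary file first (new_param.tmp here is an arbitrary name) and then rename it, eg:

awk 'BEGIN { OFS="," } FNR==NR ....' <(grep -owf ...) new_param.csv > new_param.tmp && mv new_param.tmp new_param.csv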
markp-fuso
  • `grep -owf new_param.csv master.csv | sort -u` returns only `evs` – ASHOK DAUKIYA Jan 31 '23 at 18:19
  • using the files you've provided ... that `grep|sort` returns 4 rows ... `ID`, `as-uid`, `evs` and `k_n`; I'm wondering if there are some non-printing characters in one/both of your files; consider reviewing the output from `head -2 new_param.csv master.csv | od -c` for any odd characters (eg, `\r`) – markp-fuso Jan 31 '23 at 18:25
  • `0000000 = = > n e w _ p a r a m . c s 0000020 v < = = \n I D \r \n I d e n t i 0000040 t y \r \n \n = = > R o p a . c s 0000060 v < = = \n A , x y z , i d 0000100 , a b c , , , , \r \n n , x y 0000120 z , a s - u i d , a b c , 0000140 B , x y z , n e , a b c \r 0000160 \n ` – ASHOK DAUKIYA Jan 31 '23 at 18:29
  • yeah, you've got windows/dos line endings (`\r`) in your files; try removing these characters via `dos2unix *.csv` – markp-fuso Jan 31 '23 at 18:40
  • Is this command not available on macOS? `dos2unix *.csv` gives `zsh: command not found: dos2unix` – ASHOK DAUKIYA Feb 01 '23 at 07:09
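
If `dos2unix` is not installed, the carriage returns seen in the `od -c` output above can also be stripped with `tr`, which ships with macOS and Linux. A minimal sketch, assuming the files are small enough to copy through a temporary file; the function name `strip_cr` is made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: remove every carriage return (\r) from file $1 in place.
strip_cr() {
    tmp=$(mktemp) || return 1
    # tr cannot edit in place, so filter into a temp file and copy it back
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Usage would be along the lines of `for f in *.csv; do strip_cr "$f"; done`, after which `od -c` should no longer show `\r`.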

Alternative awk option:

Use ", " (a comma followed by a space) as the field separator and concatenate the 3rd field of each record of master.csv onto the variable m. Then read each record from new_param.csv and use the index function to determine whether that record occurs in the string m. Note that this checks only the 3rd field of master.csv and does a substring match rather than an exact word match.

awk -F", " '
FNR==NR{m=m$3}
FNR<NR{print $0 (index(m,$0) ? ",true" : ",false")}
' master.csv new_param.csv

Output:

ID,true
Identity,false
as-uid,true
cp_cus_id,false
evs,true
k_n,true
j_b