0

I have users.txt as an input file in the format below.

rajesh.kumar@company.com
rhmn@company.com
mkkumar@company.com
manish.panday@company.com
daniel.m@company.com
jain@company.com
abul@company.com
aditi@company.com
aditya.s@company.com

The script below reads the users.txt file line by line and should write only the matching user email addresses, along with their rows, to the CSV file.

#!/bin/bash

Input_file="users.txt"
Output_file="result.csv"
Match_list="$(command_to_get_match_user)"
Email="$(echo "$Match_list" | awk -F '|' '{print $4}' | tr -d '[:space:]')"

while read line; do
  if [[ "$line" == *"$Email"* ]]; then
    echo "$line" >> "$Output_file"
  fi
done < "$Input_file"

The command assigned to Match_list in the above script produces the output below at run time.

1320 | | Rajesh Kumar | rajesh.kumar@company.com | live
1584 | | A.K.M. Rahman | rhmn@company.com | live
1503 | | Mukesh Kumar | mkkumar@company.com | live
1279 | | Aayush Jain | aayush.jain@company.com | live
1597 | | Abul Hasan Md Osama | abul.osama@company.com | live
1660 | | Aditi Singpuri | aditi.singpuri@company.com | live
1570 | | Aditya Jain | aditya.jain@company.com | live

Currently the above script does not write the matched results to the file.

What is wrong with my code?

user4948798

5 Answers

4

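See [why-is-using-a-shell-loop-to-process-text-considered-bad-practice](https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice).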

Given this input:

$ head -100 users.txt data.txt
==> users.txt <==
rajesh.kumar@company.com
rhmn@company.com
mkkumar@company.com
manish.panday@company.com
daniel.m@company.com
jain@company.com
abul@company.com
aditi@company.com
aditya.s@company.com

==> data.txt <==
1320 | | Rajesh Kumar | rajesh.kumar@company.com | live
1584 | | A.K.M. Rahman | rhmn@company.com | live
1503 | | Mukesh Kumar | mkkumar@company.com | live
1279 | | Aayush Jain | aayush.jain@company.com | live
1597 | | Abul Hasan Md Osama | abul.osama@company.com | live
1660 | | Aditi Singpuri | aditi.singpuri@company.com | live
1570 | | Aditya Jain | aditya.jain@company.com | live

This seems to be what you're trying to do (using any awk):

$ cat tst.sh
#!/usr/bin/env bash

command_to_get_match_user() { cat data.txt; }

Input_file="users.txt"
Output_file="result.csv"

command_to_get_match_user |
awk '
    { sub(/\r$/,"") }
    NR == FNR {
        emails[$1]
        next
    }
    $4 in emails
' "$Input_file" FS=' *[|] *' - > "$Output_file"
$ ./tst.sh
$ cat result.csv
1320 | | Rajesh Kumar | rajesh.kumar@company.com | live
1584 | | A.K.M. Rahman | rhmn@company.com | live
1503 | | Mukesh Kumar | mkkumar@company.com | live

The main difference between piping to the above awk script and @Fravadona's grep solution is that this won't produce false matches if a target email address shows up in some other field of the input, and it'll work on every Unix box, not just those with a grep that supports -w (GNU grep?). This script would also continue to work if users.txt contained a blank line, whereas the pipe to grep would then print all input lines.

Given you mentioned not getting output when it looks like you should, there may be white space and/or carriage returns at the end of the lines in your emails list which is why I'm populating the array with $1 instead of $0 and have a sub() removing terminating CRs if present.
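To illustrate the point about trailing white space and carriage returns, here is a minimal sketch. It assumes a hypothetical users.txt saved with CRLF line endings (and one stray trailing space), created in a temporary directory; the two data rows are taken from the question. Keying the array on $0 finds nothing, while stripping the CR and keying on $1 finds both rows.

```shell
#!/usr/bin/env bash
# Hypothetical sample: a users list with Windows (CRLF) line endings,
# the second line also carrying a trailing space.
tmp=$(mktemp -d)
printf 'rhmn@company.com\r\nmkkumar@company.com \r\n' > "$tmp/users.txt"

data='1584 | | A.K.M. Rahman | rhmn@company.com | live
1503 | | Mukesh Kumar | mkkumar@company.com | live'

# Keying on $0 keeps the trailing CR/space in the array index, so $4 never matches.
naive=$(printf '%s\n' "$data" |
  awk 'NR == FNR { emails[$0]; next } $4 in emails' "$tmp/users.txt" FS=' *[|] *' -)

# Removing a trailing CR and keying on $1 drops the CR and the trailing space.
robust=$(printf '%s\n' "$data" |
  awk '{ sub(/\r$/, "") } NR == FNR { emails[$1]; next } $4 in emails' "$tmp/users.txt" FS=' *[|] *' -)

printf 'naive:\n%s\n' "${naive:-<no match>}"
printf 'robust:\n%s\n' "${robust:-<no match>}"
rm -rf "$tmp"
```

Running `od -c users.txt | head` on the real file is one way to check whether it actually contains `\r` characters.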

I'm not suggesting you redefine command_to_get_match_user to be what I show - I just had to have some definition for it to test with.
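Separately from the white space question, it may help to look at what `$Email` in the original script ends up holding. The sketch below uses a stand-in `command_to_get_match_user` that prints three rows from the question (an assumption for testing only). Because `[:space:]` includes newlines, `tr -d '[:space:]'` joins all the addresses into one long string, which no line of users.txt can contain.

```shell
#!/usr/bin/env bash
# Stand-in for the real command, using three rows from the question.
command_to_get_match_user() {
  printf '%s\n' \
    '1320 | | Rajesh Kumar | rajesh.kumar@company.com | live' \
    '1584 | | A.K.M. Rahman | rhmn@company.com | live' \
    '1503 | | Mukesh Kumar | mkkumar@company.com | live'
}

Match_list="$(command_to_get_match_user)"

# Same pipeline as in the question: tr also deletes the newlines,
# so every address is glued into a single string.
Email="$(echo "$Match_list" | awk -F '|' '{print $4}' | tr -d '[:space:]')"
printf 'Email=%s\n' "$Email"

# Deleting only spaces and tabs keeps one address per line.
Emails="$(echo "$Match_list" | awk -F '|' '{print $4}' | tr -d ' \t')"
printf '%s\n' "$Emails"
```

With one address per line, the list can be fed to a lookup such as the awk script above instead of being compared as a single pattern.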

Ed Morton
1

You might get some false positives but given the patterns and input files it should be accurate enough:

command_to_get_match_user | grep -wFf users.txt > result.csv

Note: grep -w isn't standard, but it is widely supported: GNU, BSD, Solaris (non-POSIX), AIX, HP-UX, etc.
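As a sketch of the kind of false positive mentioned above, take one row from the question's data and the `jain@company.com` entry from users.txt. With `-w`, the match only has to be bounded by non-word characters, and the `.` in front of `jain` qualifies. The awk comparison of the whole fourth field is shown only for contrast; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# One row from the question's data and one pattern from users.txt.
row='1279 | | Aayush Jain | aayush.jain@company.com | live'
pattern='jain@company.com'

# -w accepts the match because "." before "jain" is not a word character.
grep_hit=$(printf '%s\n' "$row" | grep -wF -- "$pattern")

# Comparing the whole fourth field rejects the row.
awk_hit=$(printf '%s\n' "$row" | awk -F ' *[|] *' -v p="$pattern" '$4 == p')

printf 'grep -wF: %s\n' "${grep_hit:-<no match>}"
printf 'awk $4:   %s\n' "${awk_hit:-<no match>}"
```

Whether this matters depends on the real data: it only shows up when one address in users.txt is a suffix of a longer address after a `.` or similar character.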

Fravadona
1
command_to_get_match_user(){
cat << EOF
1320 | | Rajesh Kumar | rajesh.kumar@company.com | live
1584 | | A.K.M. Rahman | rhmn@company.com | live
1503 | | Mukesh Kumar | mkkumar@company.com | live
1279 | | Aayush Jain | aayush.jain@company.com | live
1597 | | Abul Hasan Md Osama | abul.osama@company.com | live
1660 | | Aditi Singpuri | aditi.singpuri@company.com | live
1570 | | Aditya Jain | aditya.jain@company.com | live
EOF
}

Output from users.txt

awk -F'|' '
    NR==FNR{
        gsub(/^ *| *$/,"",$4)
        map[$4]
        next
    }
    ($0 in map)
' <(command_to_get_match_user) users.txt

rajesh.kumar@company.com
rhmn@company.com
mkkumar@company.com

OR from command_to_get_match_user

awk -F' *[|] *' '
    NR==FNR{
        map[$0]
        next
    }
    ($4 in map)
'  users.txt <(command_to_get_match_user)


1320 | | Rajesh Kumar | rajesh.kumar@company.com | live
1584 | | A.K.M. Rahman | rhmn@company.com | live
1503 | | Mukesh Kumar | mkkumar@company.com | live
ufopilot
1

With sed to extract email addresses from command_to_get_match_user, and grep to filter users.txt:

$ grep -Ff <(command_to_get_match_user |
sed -E 's/[[:space:]]*//g;s/^([^|]*\|){3}([^|]*).*/\2/') "$Input_file" > "$Output_file"

$ cat "$Output_file"
rajesh.kumar@company.com
rhmn@company.com
mkkumar@company.com

With GNU sed:

$ grep -Ff <(command_to_get_match_user |
sed -E 's/\s*//g;s/^([^|]*\|){3}([^|]*).*/\2/') "$Input_file" > "$Output_file"
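To see what the sed step hands to grep, here is a small sketch run on a single row from the question. The first substitution removes all white space and the second keeps only the fourth `|`-separated field; the `row` and `email` variable names are illustrative.

```shell
#!/usr/bin/env bash
row='1320 | | Rajesh Kumar | rajesh.kumar@company.com | live'

# Step 1 strips all whitespace; step 2 keeps only the fourth |-separated field.
email=$(printf '%s\n' "$row" |
  sed -E 's/[[:space:]]*//g;s/^([^|]*\|){3}([^|]*).*/\2/')

printf '%s\n' "$email"
```

Since `grep -F` matches substrings, adding `-x` (match whole lines only) is one possible way to tighten the filter on users.txt, if that fits the data.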
Renaud Pacalet
0

This is not the cleanest solution but it works

#!/bin/bash

Input_file="users.txt"
Output_file="result.csv"

while IFS= read -r line; do
   # -F treats the address as a fixed string; quoting avoids word splitting and globbing
   command_to_get_match_user | grep -F -- "$line" | awk 'BEGIN { FS = "|" } ; {print $1,$3,$4}' | sed 's/   /,/g' >> "$Output_file"
done < "$Input_file"

Earlier you said you wanted the output as CSV, but it looks like you edited your post to remove that. It should be easy enough to edit the code to get the output in any format you'd like.
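As one possible way to produce comma-separated output, the sketch below lets awk split on the padded `|` separator and join the chosen fields with `OFS`. The choice of fields 1, 3 and 4 mirrors the script above; the variable names are illustrative, and it assumes the fields themselves contain no commas or quotes.

```shell
#!/usr/bin/env bash
row='1320 | | Rajesh Kumar | rajesh.kumar@company.com | live'

# Split on "|" with optional surrounding spaces, then join fields with commas.
csv=$(printf '%s\n' "$row" | awk -F ' *[|] *' -v OFS=',' '{ print $1, $3, $4 }')

printf '%s\n' "$csv"
```

Because the separator regex already consumes the padding, no extra sed step is needed to turn runs of spaces into commas.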

alieb
  • It was mistakenly updated. Basically, I need the output to be in a `csv` file. – user4948798 Mar 20 '23 at 10:34
  • @user4948798 The code I posted should do that then. – alieb Mar 20 '23 at 10:37
  • 1
    That'll do different things depending on the content of and/or the name of the input files, the directory you run it from, environment settings, etc. and it'll run orders of magnitude slower than, say, an awk script that doesn't have those issues. You could run it through http://shellcheck.net to identify and fix some, but not all, of those issues (see [why-is-using-a-shell-loop-to-process-text-considered-bad-practice](https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice) for more on that). – Ed Morton Mar 20 '23 at 11:15