
I have an LDAP query that I'm feeding to a script via stdin. I want to search it for specific values, possibly more than one, and then write each value found to stdout.

My LDAP query looks like this:

discover-repository-location=null, File Name=null, date-detected=Tue Jun11 12:44:14 UTC 2013, endpoint-machine-name=null, incident-id=545527, sender-ip=12.1.141.87, sender-email=WinNT://tmpdm/tmpcmp, Assigned To=null, sender-port=-null, endpoint-domain-name=null, Business Unit=null, endpoint-dos-volume-name=null, file-access-date=null, date-sent=Tue Jun 11 12:44:14 UTC 2013, endpoint-file-name=null, file-modified-by=null, Country=null, Manager Email=null, plugin-chain-id=1, discover-server=null, data-owner-name=null, Dismissal Reason=null, Last Name=null, First Name=null, Phone=null, subject=HTTP incident, Sender Email=null, UserID=null, endpoint-user-name=null, endpoint-volume-name=null, discover-name=null, discover-content-root-path=null, data-owner-email=null, file-create-date=null, endpoint-application-name=null, Employee Code=null, Region=null, Manager First Name=null, path=null, endpoint-application-path=null, Manager Last Name=null, Department=null, discover-location=null, protocol=HTTP, Resolution=null, file-owner=null, Postal Code=null, endpoint-file-path=null, Title=null, discover-extraction-date=null, Script-attribute=null, Manager Phone=null, file-created-by=null, file-owner-domain=nul

Say I want to pull the protocol or sender-email attribute out of this query, which is read in as a single line. I can read it in with:

while IFS= read -r line; do
    echo "$line"
done

I can check that these attributes exist, but I am having trouble grabbing the value from the key-value pair. I am trying to do so with regular expressions in bash. I'd like to grab the full value using '=' and ',' as the delimiters, and then possibly use a regular expression to validate that I've grabbed a correct value for the attribute (as a safety check, and for logging purposes).

Any input would be useful and greatly appreciated.

signus
3 Answers


If you don't want to mess with awk (or friends) you can also do this in pure bash:

if [[ $query =~ protocol=([^,]+) ]] ; then
    protocol=${BASH_REMATCH[1]}
fi
if [[ $query =~ sender-email=([^,]+) ]] ; then
    sender_email=${BASH_REMATCH[1]}
fi

(assuming your entire query is in the $query variable).

Note also I have used a "_" instead of a "-" in the sender_email variable name.

Coincidentally I was not aware of the BASH_REMATCH array until last night, when I happened to need it too!

More info in the GNU Bash documentation.
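The same idea can be wrapped in a small function. This is only a sketch: the name `get_attr`, the shortened sample query, and the validation pattern are assumptions for illustration, not part of the answer above. Keeping the pattern in a variable and expanding it unquoted avoids the version-dependent behaviour of quoted regexes.

```shell
#!/usr/bin/env bash
# get_attr QUERY KEY: print the value of KEY from a "k=v, k=v" string.
# Hypothetical helper; assumes KEY contains no regex metacharacters.
get_attr() {
    local query=$1 key=$2
    # Anchor on start-of-string or ", " so KEY cannot match inside a longer key.
    local re="(^|, )${key}=([^,]*)"
    if [[ $query =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

# Shortened sample of the query from the question.
query='incident-id=545527, sender-email=WinNT://tmpdm/tmpcmp, protocol=HTTP, Resolution=null'
protocol=$(get_attr "$query" protocol)
sender_email=$(get_attr "$query" sender-email)

# Optional sanity check on the extracted value (the pattern is an assumption).
proto_re='^[A-Z]+$'
if [[ $protocol =~ $proto_re ]]; then
    echo "protocol=$protocol"
fi
echo "sender_email=$sender_email"
```

To use it with stdin, read the line first (for example `IFS= read -r query`) and pass it as the first argument.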

Digital Trauma
  • I knew there was an array for those matches! I do think the regex you supplied is slightly incorrect, but this is what I was looking for. I use "sender-email\s*=([^,]*)" and I can retrieve the correct value that way. Definitely appreciated, thank you! – signus Sep 04 '13 at 19:20
  • Be careful when using quotes around the regex itself. The behaviour appears to be version-dependent. http://stackoverflow.com/questions/218156/bash-regex-with-quotes – Digital Trauma Sep 04 '13 at 20:26
  • I'm not using them in quotes, I was just quoting my regex for the sake of the comment, but definitely a good pointer for most people who have a problem checking regexes. – signus Sep 04 '13 at 20:32

Pure bash solution:

data='discover-repository-location=null, File Name=null, date-detected=Tue Jun11 12:44:14 UTC 2013, endpoint-machine-name=null, incident-id=545527, sender-ip=12.1.141.87, sender-email=WinNT://tmpdm/tmpcmp, Assigned To=null, sender-port=-null, endpoint-domain-name=null, Business Unit=null, endpoint-dos-volume-name=null, file-access-date=null, date-sent=Tue Jun 11 12:44:14 UTC 2013, endpoint-file-name=null, file-modified-by=null, Country=null, Manager Email=null, plugin-chain-id=1, discover-server=null, data-owner-name=null, Dismissal Reason=null, Last Name=null, First Name=null, Phone=null, subject=HTTP incident, Sender Email=null, UserID=null, endpoint-user-name=null, endpoint-volume-name=null, discover-name=null, discover-content-root-path=null, data-owner-email=null, file-create-date=null, endpoint-application-name=null, Employee Code=null, Region=null, Manager First Name=null, path=null, endpoint-application-path=null, Manager Last Name=null, Department=null, discover-location=null, protocol=HTTP, Resolution=null, file-owner=null, Postal Code=null, endpoint-file-path=null, Title=null, discover-extraction-date=null, Script-attribute=null, Manager Phone=null, file-created-by=null, file-owner-domain=nul'

declare -A allValues

while read -r -d ',' line; do
    IFS='=' read -r key value <<< "${line}"
    allValues["$key"]=$value
done <<< "$data,"

echo "${allValues['protocol']}" # prints HTTP
echo "${allValues['sender-email']}" # prints WinNT://tmpdm/tmpcmp

This way you can get any field you want. Of course, it will break if a value contains a comma (or a key contains an =).
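A variation on the same loop, sketched with parameter expansion instead of `read`: it splits pairs on the exact ", " delimiter and splits each pair at the first "=". The array name `fields` and the `filter=a=b` pair are hypothetical, added only to show a value containing "=". Values containing ", " would still break it.

```shell
#!/usr/bin/env bash
# Requires bash 4+ for associative arrays.
# Shortened sample; "filter=a=b" is a made-up pair for illustration.
data='subject=HTTP incident, sender-email=WinNT://tmpdm/tmpcmp, filter=a=b, protocol=HTTP'

declare -A fields
rest=$data
while [[ -n $rest ]]; do
    pair=${rest%%, *}          # text up to the first ", "
    if [[ $rest == *", "* ]]; then
        rest=${rest#*, }       # drop that pair and its delimiter
    else
        rest=''                # last pair consumed
    fi
    key=${pair%%=*}            # everything before the first "="
    value=${pair#*=}           # everything after the first "="
    fields["$key"]=$value
done

echo "${fields[subject]}"
echo "${fields[filter]}"
echo "${fields[protocol]}"
```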

Aleks-Daniel Jakimenko-A.
  • I do like this approach, however I try and use regular expressions when possible. I do appreciate it and this does generate the correct output, thank you! – signus Sep 04 '13 at 19:18

With awk:

 protocol=$(awk -F'=' '$1=="protocol"{print $2}' RS='[, ]+' <<< "$STR" )
 sender_email=$(awk -F'=' '$1=="sender-email"{print $2}' RS='[, ]+' <<< "$STR")

With Grep:

 protocol=$(grep -oP '(?<=protocol=).*?(?=, )' <<< "$STR")
 sender_email=$(grep -oP '(?<=sender-email=).*?(?=, )' <<< "$STR")
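A regex `RS` in awk and `-P` in grep are not specified by POSIX, so where they are unavailable a plain `tr` and `sed` pipeline may serve instead. This is a sketch under stated assumptions: `get_field` is a hypothetical helper name, the sample string is shortened, and no key is expected to contain regex metacharacters or a slash.

```shell
# get_field KEY: read a "k=v, k=v" line on stdin and print the value of KEY.
# Hypothetical helper; assumes no value contains a comma.
get_field() {
    # One pair per line, then strip the leading blank and "KEY=" from the match.
    tr ',' '\n' | sed -n "s/^ *$1=//p"
}

STR='sender-ip=12.1.141.87, sender-email=WinNT://tmpdm/tmpcmp, subject=HTTP incident, protocol=HTTP'
protocol=$(printf '%s\n' "$STR" | get_field protocol)
sender_email=$(printf '%s\n' "$STR" | get_field sender-email)
echo "$protocol"
echo "$sender_email"
```

Because each pair is matched as a whole line, values with embedded spaces such as `subject=HTTP incident` come through intact.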
user000001