I have data separated by the pipe character "|", and I would like to parse it with awk and write it into a DB.
EndpointRequest|ID-ip-172-31-70-119-eu-west-1-compute-internal-209879772|2022-05-12 08:20:03:467|0|ip-172-31-70-119|616e50193233020648|vfgh|GenericAmount|61d458303574b21f|Display|v1|Display-v1|PrepaidEndpoint|6227300ec1786d26|Corporate|62273041c8cf901071786d81|Health Line||||69.28.67.153|Java/1.8.0_321|application/xml|468|475|POST||http://127.0.0.1/endpoint/||200||2022-05-12 08:20:03:458|0|468|7|0|0|0|true|Http|null|null|HTTPConnector:CallPrepaid|Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2\nAuthorization: Bearer e3edbb1d8f5d8c828dc584ed293602bf\nContent-Type: application/xml\nX-Amzn-Trace-Id: Root=1-627cc333-7167\nX-Forwarded-For: XX.XX.XX.XX\nX-Forwarded-Port: 443\nX-Forwarded-Proto: https\n\n<?xml version="1.0"?>\n<!DOCTYPE cp_request SYSTEM "cp_req_websvr.dtd">\n<cp_request>\n <cp_id>YY1880</cp_id>\n <cp_transaction_id>SDP</cp_transaction_id>\n <op_transaction_id>arr684754251</op_transaction_id>\n <application>1</application>\n <action>2</action>\n <user_id type="MSISDN">9999999999</user_id>\n <cp_timer>5</cp_timer>\n <transaction_price>1900</transaction_price>\n <transaction_currency>0</transaction_currency>\n</cp_request>
The file has many lines like the one above, and I use the command below to extract certain fields:
more file.log | egrep "EndpointRequest|EndpointSuccess|EndpointFailure" | egrep "PrepaidEndpoint" | awk -F"|" '{print $1"|"$2"|"$3"|"$4"|"$5"|"$12"|"$13"|"$15"|"$17"|"$21"|"$25"|"$30"|"$31"|"$32"|"$33"|"$44}'
The problem is that the last field (#44) holds an HTTP response containing some headers and an XML payload. I need to extract the "op_transaction_id" value ("arr684754251") and append it to the awk output, but I have been unable to do so. In a separate command, I can get that value with sed:
sed -n "s/.*<op_transaction_id>\(.*\)<\/op_transaction_id>.*/\1/p" file.log
How do I merge the sed command into the awk command, so that the "op_transaction_id" value becomes one of the fields in the awk output?
Expected output:
EndpointRequest|ID-ip-172-31-70-119-eu-west-1-compute-internal-209879772|2022-05-12 08:20:03:467|0|ip-172-31-70-119|Display-v1|PrepaidEndpoint|Corporate|Health Line|69.28.67.153|475|200||2022-05-12 08:20:03:458|0|arr684754251
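For reference, here is a minimal sketch of one way the sed extraction might be folded into a single awk call to produce the output above. It rests on a few assumptions: each record sits on one physical line (the "\n" sequences in the payload are a literal backslash followed by "n", as in the sample), the tag appears at most once per record, and the function name extract_fields is hypothetical. It searches the whole line for the tag, as the sed command does, rather than relying on the payload being exactly field 44.

```shell
# Hypothetical helper: prints the selected fields plus op_transaction_id.
# Usage: extract_fields file.log
extract_fields() {
    awk -F'|' -v OFS='|' '
        BEGIN {
            otag = "<op_transaction_id>"
            ctag = "</op_transaction_id>"
        }
        # Same line-level filters as the two egrep calls.
        /EndpointRequest|EndpointSuccess|EndpointFailure/ && /PrepaidEndpoint/ {
            txid = ""
            start = index($0, otag)
            if (start > 0) {
                # Text after the opening tag, cut at the closing tag.
                rest = substr($0, start + length(otag))
                stop = index(rest, ctag)
                if (stop > 0)
                    txid = substr(rest, 1, stop - 1)
            }
            # txid stays empty when the record has no such tag.
            print $1, $2, $3, $4, $5, $12, $13, $15, $17, $21, $25, $30, $31, $32, $33, txid
        }
    ' "$1"
}
```

Calling extract_fields file.log would then replace the more, egrep and awk pipeline; using index() and substr() instead of a regular expression keeps the sketch within POSIX awk.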
Thank you bash gurus. Any help is appreciated.