I'm trying to extract three fields from a text file whose lines look like this:
<Record type="HKQuantityTypeIdentifierHeartRate" sourceName="Michael’s Apple Watch" sourceVersion="6.2.5" device="<<HKDevice: 0x2877dc870>, name:WHOOP 3A020013, manufacturer:WHOOP Inc., localIdentifier:80A56B86-0DEC-A6C3-7B22-077BD4BE4C8D>" unit="count/min" creationDate="2020-05-30 07:26:39 -0400" startDate="2020-05-30 07:26:39 -0400" endDate="2020-05-30 07:26:39 -0400" value="72">
<Record type="HKQuantityTypeIdentifierHeartRate" sourceName="Wahoo" sourceVersion="3135" unit="count/min" creationDate="2020-05-30 07:37:05 -0400" startDate="2020-05-30 07:35:46 -0400" endDate="2020-05-30 07:37:01 -0400" value="83"/>
This is the information I'd like to extract:
sourceName, creationDate, value
"Michael’s Apple Watch", "2020-05-30 07:26:39", "72"
"Wahoo", "2020-05-30 07:37:05", "83"
So I basically need the source name, the creationDate (date and time, without the UTC offset) and the value, in a comma-separated format.
The issue I'm having is that the device attribute has multiple nested "fields" of its own (angle brackets, colons and commas), and both sourceName and creationDate contain spaces, so my previous attempts at splitting the lines with grep and awk all failed :)
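To make the goal concrete, here is a rough Python sketch of the per-attribute matching I have in mind. It is only a sketch under some assumptions: each record sits on a single line, attribute values never contain a double quote, and the names extract_fields and to_csv_row are placeholders I made up. I'm not sure how robust it is on a full export.

```python
import re

# One pattern per wanted attribute. [^"]* stops at the closing quote,
# so spaces (and the nested device text elsewhere on the line) don't matter.
SOURCE_RE = re.compile(r'\bsourceName="([^"]*)"')
# Capture only the date and time, leaving out the trailing UTC offset.
CREATED_RE = re.compile(r'\bcreationDate="(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})')
VALUE_RE = re.compile(r'\bvalue="([^"]*)"')


def extract_fields(line):
    """Return (sourceName, creationDate, value), or None if any is missing."""
    matches = [rx.search(line) for rx in (SOURCE_RE, CREATED_RE, VALUE_RE)]
    if not all(matches):
        return None
    return tuple(m.group(1) for m in matches)


def to_csv_row(fields):
    """Quote each field and join with ', ' to match the wanted output."""
    return ", ".join(f'"{field}"' for field in fields)


# The second sample record from above, used as a quick check.
sample = (
    '<Record type="HKQuantityTypeIdentifierHeartRate" sourceName="Wahoo" '
    'sourceVersion="3135" unit="count/min" '
    'creationDate="2020-05-30 07:37:05 -0400" '
    'startDate="2020-05-30 07:35:46 -0400" '
    'endDate="2020-05-30 07:37:01 -0400" value="83"/>'
)

if __name__ == "__main__":
    print("sourceName, creationDate, value")
    print(to_csv_row(extract_fields(sample)))  # "Wahoo", "2020-05-30 07:37:05", "83"
```

Looking up each attribute by name avoids splitting the line on spaces or commas, which is where my grep and awk attempts broke down.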
Any help would be greatly appreciated.