I need to search a directory containing hundreds or thousands of files, each holding XML with one or more instances of a specific string (a begin/end tag pair with data in between). I can get all the instances of the string by running:
grep -ho '<mytagname>..............</mytagname>' /home/xyzzy/mydata/*.XML > /home/mydata/tagvalues.txt
I then run a few sed commands to strip off the tags, so I wind up with a file containing just a list of values:
value001
value002
value003
(etc)
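For reference, the tag-stripping step can be sketched roughly as follows. The function name strip_tags is made up for illustration, and mytagname is the same placeholder tag name as in the grep command; my actual sed commands differ a little, but the idea is the same.

```shell
# Hypothetical helper: remove the opening and closing tag from each line
# read on stdin, leaving only the value between them.
# "mytagname" is a placeholder for the real tag name.
strip_tags() {
    sed -e 's|<mytagname>||' -e 's|</mytagname>||'
}

# Example: feed one grep-style match through the helper.
printf '%s\n' '<mytagname>value001</mytagname>' | strip_tags   # prints: value001
```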
Ideally, though, I'd like each line of the file to also include the filename, so I can import it into a database for analysis.
So my result would be something like this:
fileAAA value001
fileAAA value002
fileAAA value003
fileBBB value004
The exact formatting of the above is flexible: the separator could be a space or something else, and the lines could even still include the begin/end tags.
The closest I've been able to get is with grep -o, which gives the filename on only some of the lines:
fileAAA:value001
value002
value003
fileBBB:value004
A Perl one-liner would seem ideal, but I'm new enough to Perl that I have no clue how to begin.
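Failing that, something along these lines in plain shell might be a starting point. This is only a rough sketch: the function name list_tag_values is made up, the pattern [^<]* (any run of characters other than '<') is an assumption standing in for the fixed-width dots in my grep command, and it assumes each tag pair sits on a single line.

```shell
# Hypothetical helper: for every file given as an argument, print one
# "name value" line per <mytagname>...</mytagname> match.
# Assumes a grep that supports -o (GNU and BSD grep both do).
list_tag_values() {
    for f in "$@"; do
        [ -f "$f" ] || continue            # skip an unmatched glob or a directory
        name=$(basename "$f" .XML)         # e.g. fileAAA.XML -> fileAAA
        grep -o '<mytagname>[^<]*</mytagname>' "$f" |
            sed -e 's/<[^>]*>//g' |        # strip the begin/end tags
            while IFS= read -r value; do
                printf '%s %s\n' "$name" "$value"
            done
    done
}
```

It would be called as list_tag_values /home/xyzzy/mydata/*.XML with the output redirected to a file. Running grep once per file is slower than a single grep over the whole directory, but it keeps the filename attached to every value.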