I have a log file that I need to parse to get three values: RSSUrl, RSSCategory, and Url val. I can get each of these values individually, but I can't figure out how to get all three together so that I keep the context of each.

Here is the basic format of the file:

    <key id="1" goodness="0" softCached="false" hits="0" creationMillis="1327941760709"       creationMillisAgo="-978" lastHitMillisAgo="INF" size="0" numRows="30" cache_type="L2" limit="1" type="data">
    <filters>
        <filter attr="Community/RSSCategory" value="Jeep"/>
            <filter attr="Community/RSSUrl" value="http://blogs.int.automotive.com/getrequest.php?url=http://blogs.automotive.com/"/>
        <filter attr="Community/NamespaceLookupCommunity"/>
        <filter attr="Krang/NamespaceLookupKrang"/>
    </filters>
    <params>
        <param name="CacheLifeSeconds" value="300"/>
        <param name="LIMIT" value="1"/>
        <param name="ReturnColumns" value="Title,Url,PublishDate,Description,ImageUrl"/>
        <param name="START" value="0"/>
    </params>
    <returns>
        <return attr="Community/RSSResult"/>
    </returns>
    <orders>
        <order attr="Krang/PublishDate" type="DESC"/>
    </orders>
    <keyString>
        [[data,filters=[Community/RSSUrl,Community/NamespaceLookupCommunity,Krang/NamespaceLookupKrang],params=[LIMIT,START],return=[Community/RSSResult],order=[Krang/PublishDate-]],start=0,limit=1]
    </keyString>
</key>
<keyend id="1" nowMillis="1327941760713" queryTimeNanos="115132">
<cached type="L1"/><CallContext>    <ServerName val="WEB-059" />
    <ServerId val="ȯ" />
    <PageName val="Default+%2F+Default" />
    <ClientIp val="10.1.12.111" />
    <Url val="http%3A%2F%2Fwww.automobilemag.com%2Findex.html" />
</CallContext></keyend>

I tried this:

    grep -E '<filter attr=' rssurl.txt | grep -E '<Url val' rssurl.txt

But it doesn't bring the three values back together. Any thoughts?

pamozer

2 Answers

    grep -E '<filter attr="Community/(RSSUrl|RSSCategory)"|<Url val=' rssurl.txt
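
The grep prints the matching lines in file order, but it does not tie them into one record. If that grouping is what matters, the same three patterns can be applied line by line in a short script. What follows is only a minimal Python sketch: the names `PATTERNS`, `group_records`, and `SAMPLE` are made up for illustration, and it assumes every record ends with a line containing `</keyend>`, as in the excerpt from the question.

```python
import re

# The same three patterns as the grep, each with a capture group for the value.
PATTERNS = {
    "RSSCategory": re.compile(r'<filter attr="Community/RSSCategory" value="([^"]*)"'),
    "RSSUrl": re.compile(r'<filter attr="Community/RSSUrl" value="([^"]*)"'),
    "Url": re.compile(r'<Url val="([^"]*)"'),
}


def group_records(lines):
    """Yield one dict per record, collecting the three values as they appear."""
    record = {}
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                record[name] = match.group(1)
        # Assumption: a line containing </keyend> closes the current record.
        if "</keyend>" in line:
            yield record
            record = {}


# Shortened copy of the log excerpt from the question (illustration only).
SAMPLE = """<key id="1" type="data">
<filters>
    <filter attr="Community/RSSCategory" value="Jeep"/>
    <filter attr="Community/RSSUrl" value="http://blogs.int.automotive.com/getrequest.php?url=http://blogs.automotive.com/"/>
</filters>
</key>
<keyend id="1" nowMillis="1327941760713" queryTimeNanos="115132">
<cached type="L1"/><CallContext>    <ServerName val="WEB-059" />
    <Url val="http%3A%2F%2Fwww.automobilemag.com%2Findex.html" />
</CallContext></keyend>
"""

for rec in group_records(SAMPLE.splitlines()):
    # One tab-separated line per record: category, feed URL, page URL.
    print(rec.get("RSSCategory"), rec.get("RSSUrl"), rec.get("Url"), sep="\t")
```

For the real log, an open file object can be passed to `group_records` in place of `SAMPLE.splitlines()`. The Url value stays percent-encoded here, exactly as it appears in the log.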
jon

Note that regular expressions are not good at parsing XML. Use an XML parser instead.
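
As a rough illustration of the parser route, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It rests on several assumptions that may not hold for the real file: the log is well-formed apart from lacking a single root element (so the sketch wraps it in a made-up `<log>` element), and each `<keyend>` belongs to the `<key>` with the same `id`, as in the excerpt. The names `parse_log` and `SAMPLE` are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote


def parse_log(text):
    """Return {key id: {"RSSCategory": ..., "RSSUrl": ..., "Url": ...}}."""
    # Assumption: the log is a flat sequence of <key>/<keyend> elements,
    # so a synthetic <log> root makes it a single parseable document.
    root = ET.fromstring("<log>" + text + "</log>")

    records = {}
    for key in root.iter("key"):
        # Map each filter's attr to its value (None when there is no value).
        filters = {f.get("attr"): f.get("value") for f in key.iter("filter")}
        records[key.get("id")] = {
            "RSSCategory": filters.get("Community/RSSCategory"),
            "RSSUrl": filters.get("Community/RSSUrl"),
        }

    for keyend in root.iter("keyend"):
        url = keyend.find("CallContext/Url")
        # Assumption: <keyend id="N"> closes the record opened by <key id="N">.
        if url is not None and keyend.get("id") in records:
            # The Url value is percent-encoded in the log; decode it.
            records[keyend.get("id")]["Url"] = unquote(url.get("val"))
    return records


# Shortened copy of the log excerpt from the question (illustration only).
SAMPLE = """<key id="1" type="data">
<filters>
    <filter attr="Community/RSSCategory" value="Jeep"/>
    <filter attr="Community/RSSUrl" value="http://blogs.int.automotive.com/getrequest.php?url=http://blogs.automotive.com/"/>
    <filter attr="Community/NamespaceLookupCommunity"/>
</filters>
</key>
<keyend id="1" nowMillis="1327941760713" queryTimeNanos="115132">
<cached type="L1"/><CallContext>    <ServerName val="WEB-059" />
    <Url val="http%3A%2F%2Fwww.automobilemag.com%2Findex.html" />
</CallContext></keyend>
"""

for key_id, rec in parse_log(SAMPLE).items():
    print(key_id, rec["RSSCategory"], rec["RSSUrl"], rec.get("Url"), sep="\t")
```

If the real log contains unescaped `&` characters or repeats the same `id` for unrelated records, this sketch would need adjusting; a strict parser raises an error on malformed input rather than silently skipping it.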

ghoti