I have the following XML:

 <rss version="2.0"
    xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"
    xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:wp="http://wordpress.org/export/1.2/"
>
<channel>    
     <item>
        <title>Lucy – Official trailer 2014 – Universal Pictures</title>
        <pubDate>Mon, 10 Jul 2017 13:13:05 +0000</pubDate>
        <description></description>
        <excerpt:encoded><![CDATA[]]></excerpt:encoded>
        <wp:post_id>5688</wp:post_id>
        <wp:post_date><![CDATA[2017-07-10 13:13:05]]></wp:post_date>
        <wp:post_date_gmt><![CDATA[2017-07-10 13:13:05]]></wp:post_date_gmt>
        <wp:comment_status><![CDATA[closed]]></wp:comment_status>
        <wp:ping_status><![CDATA[open]]></wp:ping_status>
        <wp:post_name><![CDATA[lucy-official-trailer-2014-universal-pictures]]></wp:post_name>
        <wp:status><![CDATA[publish]]></wp:status>
        <wp:post_parent>0</wp:post_parent>
        <wp:menu_order>0</wp:menu_order>
        <wp:post_type><![CDATA[post]]></wp:post_type>
        <wp:post_password><![CDATA[]]></wp:post_password>
        <wp:is_sticky>0</wp:is_sticky>
        <wp:postmeta>
            <wp:meta_key><![CDATA[jtheme_video_file]]></wp:meta_key>
            <wp:meta_value><![CDATA[]]></wp:meta_value>
        </wp:postmeta>
        <wp:postmeta>
            <wp:meta_key><![CDATA[_post_like_count]]></wp:meta_key>
            <wp:meta_value><![CDATA[6]]></wp:meta_value>
        </wp:postmeta>
        <wp:postmeta>
            <wp:meta_key><![CDATA[snap_isAutoPosted]]></wp:meta_key>
            <wp:meta_value><![CDATA[1]]></wp:meta_value>
        </wp:postmeta>
        <wp:postmeta>
            <wp:meta_key><![CDATA[_snap_forceSURL]]></wp:meta_key>
            <wp:meta_value><![CDATA[2]]></wp:meta_value>
        </wp:postmeta>
        <wp:postmeta>
            <wp:meta_key><![CDATA[snap_MYURL]]></wp:meta_key>
            <wp:meta_value><![CDATA[]]></wp:meta_value>
        </wp:postmeta>
        <wp:postmeta>
            <wp:meta_key><![CDATA[snapEdIT]]></wp:meta_key>
            <wp:meta_value><![CDATA[1]]></wp:meta_value>
        </wp:postmeta>
        <wp:postmeta>
            <wp:meta_key><![CDATA[_post_like_modified]]></wp:meta_key>
            <wp:meta_value><![CDATA[2017-07-13 19:58:16]]></wp:meta_value>
        </wp:postmeta>
        <wp:postmeta>
            <wp:meta_key><![CDATA[_yst_prominent_words_version]]></wp:meta_key>
            <wp:meta_value><![CDATA[1]]></wp:meta_value>
        </wp:postmeta>
        <wp:postmeta>
            <wp:meta_key><![CDATA[jtheme_video_code]]></wp:meta_key>
            <wp:meta_value><![CDATA[<iframe width="1280" height="720" src="https://www.youtube.com/embed/bN7ksFEVO9U" frameborder="0" allowfullscreen></iframe>]]></wp:meta_value>
        </wp:postmeta>
    </item>
</channel>
</rss>

Using xmlstarlet and XPath, I would like to find the wp:postmeta whose wp:meta_value contains the video ID bN7ksFEVO9U.

Once the matching wp:meta_value is located, the command should print the title of the item that contains it.

Thank you in advance

andregr_jp
  • The namespace prefixes `excerpt` and `wp` aren't bound, so your XML isn't [namespace well-formed](https://stackoverflow.com/a/25830482/317052). We need a well-formed XML before we can help you. – Daniel Haley Sep 30 '17 at 02:41
  • Hi daniel-haley, thanks for the advice. I added the top of the XML. The full file is about 300k lines, but it just repeats the item tag for different content. – andregr_jp Sep 30 '17 at 10:18

1 Answer

What you'll need to do is bind the http://wordpress.org/export/1.2/ namespace to a prefix (with -N), match the correct item (with -m), and print the value of its title (with -v). You can also use -n to print a newline after the title.

Example...

==> xml sel -N wp=http://wordpress.org/export/1.2/ -t -m "/rss/channel/item[wp:postmeta[normalize-space(wp:meta_key)='jtheme_video_code' and contains(wp:meta_value,'/bN7ksFEVO9U\"')]]" -v "title" -n input.xml
Lucy – Official trailer 2014 – Universal Pictures
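
If xmlstarlet is not available, roughly the same lookup can be sketched in Python with the standard library. This is only an illustrative sketch, not part of the xmlstarlet solution: the function name `titles_for_video` and the trimmed `SAMPLE` document are made up for the example, and it assumes the video ID is always preceded by `/` and followed by either a closing quote or a `?` query string (for example `?rel=0`).

```python
import re
import xml.etree.ElementTree as ET

# Same namespace binding that -N does in xmlstarlet.
NS = {"wp": "http://wordpress.org/export/1.2/"}

# Hypothetical, trimmed-down sample in the same shape as the export file.
SAMPLE = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
<channel>
  <item>
    <title>Lucy trailer</title>
    <wp:postmeta>
      <wp:meta_key><![CDATA[jtheme_video_code]]></wp:meta_key>
      <wp:meta_value><![CDATA[<iframe src="https://www.youtube.com/embed/bN7ksFEVO9U" frameborder="0"></iframe>]]></wp:meta_value>
    </wp:postmeta>
  </item>
  <item>
    <title>Other video</title>
    <wp:postmeta>
      <wp:meta_key><![CDATA[jtheme_video_code]]></wp:meta_key>
      <wp:meta_value><![CDATA[<iframe src="https://www.youtube.com/embed/qMNW-5SV7Cc?rel=0" frameborder="0"></iframe>]]></wp:meta_value>
    </wp:postmeta>
  </item>
</channel>
</rss>"""


def titles_for_video(xml_text, video_id):
    """Return the titles of items whose jtheme_video_code embeds video_id."""
    # "/<id>" followed by a closing quote or "?" (assumed URL shape).
    pattern = re.compile("/" + re.escape(video_id) + r'["?]')
    root = ET.fromstring(xml_text)
    titles = []
    for item in root.findall("./channel/item"):
        for meta in item.findall("wp:postmeta", NS):
            key = meta.findtext("wp:meta_key", default="", namespaces=NS).strip()
            value = meta.findtext("wp:meta_value", default="", namespaces=NS)
            if key == "jtheme_video_code" and pattern.search(value):
                titles.append(item.findtext("title", default=""))
                break  # one hit per item is enough
    return titles


print(titles_for_video(SAMPLE, "bN7ksFEVO9U"))
```

For a real export file, `ET.parse(path).getroot()` could replace `ET.fromstring(xml_text)`; the rest of the sketch would stay the same.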
Daniel Haley
  • Thank you Daniel Haley, that was exactly what I was looking for. – andregr_jp Sep 30 '17 at 15:16
  • Hi @daniel, I found some URLs that have ?rel= and ?list=, for example: src="https://www.youtube.com/embed/qMNW-5SV7Cc?rel=0" frameborder="0" allowfullscreen>]]> and src="https://www.youtube-nocookie.com/embed/VNZEtLfShMQ?list=PLVfin74Qx3tVORX9DPW1jaxtVfFTsFs6U" frameborder="0" allowfullscreen>]]>. I tried this, but it didn't work: xmlstarlet sel -N wp=http://wordpress.org/export/1.2/ -t -m "/rss/channel/item[wp:postmeta[normalize-space(wp:meta_key)='jtheme_video_code' and contains(wp:meta_value,'/3Sy7RofBmrs\?rel=0')]]" -v "title" -n file.xml – andregr_jp Sep 30 '17 at 19:27