I have a text which looks like this:

<a href="/track/867059" itemprop="url" class="evt-click" data-target="track">
                            <span itemprop="name">Feel So Good </span>
                        </a>
                        <span class="featuring" data-target="featuring"></span>
                    </div>
                </td>
                <td class="artist">
                    <div class="wrapper ellipsis">
                        <a class="evt-click" href="/artist/7" data-target="artist" itemprop="byArtist">Jamiroquai</a>
                    </div>
                </td>
                <td class="album">
                    <div class="wrapper ellipsis">
                        <a class="evt-click" href="/album/98952" itemprop="inAlbum" data-target="album" >A Funk Odyssey</a>
                    </div>
                </td>
                <td class="length">
                    <div class="wrapper" data-target="length"></div>
                </td>
                <td class="popularity" title="By popularity:7.85 / 10">
                    <span class="note" data-target="note"></span>
                </td>
                <td class="added">
                    <div class="wrapper ellipsis timestamp" data-target="added">
                        05:23

and I want to get the 05:23 at the end of the text. I tried these two patterns but they both failed.

(\d{2}:\d{2})$
data-target=\"added\">(.*?)$

What would the right pattern for this be?

user2530266
  • The right approach would be to first read [this](http://stackoverflow.com/a/1732454/932418) and then use [this](https://htmlagilitypack.codeplex.com/) – Eser Aug 29 '15 at 21:24
  • I know, but I am on WP8.1 and most of the HTML parsers aren't compatible. Also, it is a file I have to handle, so the source doesn't actually change. – user2530266 Aug 29 '15 at 21:28
  • `most of the html parsers aren't compatible` What about HtmlAgilityPack? Have you tried it? I would find it odd if it doesn't work with WP8.1 while it is supporting WP7 :) – Eser Aug 29 '15 at 21:32
  • I have already tried it. However, NuGet references the incorrect assembly for WP8. Also, as I said, it is a standard text which won't change. Why would I install HtmlAgilityPack for such a thing? – user2530266 Aug 29 '15 at 21:38
  • If it doesn't change then use string functions like IndexOf and Substring. Why do you need regex? BTW: http://stackoverflow.com/questions/25261194/htmlagilitypack-using-linq-for-windows-phone-8-1-platform or http://stackoverflow.com/questions/26698775/windows-phone-8-1-hubapp-htmlagilitypack – Eser Aug 29 '15 at 21:41
  • My guess is that you need a modifier so that `.` matches newlines as well as any other character, possibly `s`; I'm not familiar with C#, though. A parser would also be your best bet: https://msdn.microsoft.com/en-us/library/yd1hzczs(v=vs.110).aspx#Singleline – chris85 Aug 29 '15 at 21:56
  • Your first pattern matches just fine, so it must be a mistake in your code. – l'L'l Aug 29 '15 at 21:59
  • @l'L'l This is weird. MatchCollection would fail with that pattern while Match succeeds. Thank you for making me think twice ;) – user2530266 Aug 29 '15 at 22:24
  • MatchCollection still works fine with the first pattern; without seeing your code there's no telling where it's actually failing. – l'L'l Aug 29 '15 at 23:03
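
The comments suggest two routes: a regex that copes with the line break after the opening tag, or plain `IndexOf`/`Substring`. Here is a minimal sketch of both. The thread never establishes why the original patterns failed, so this is only one plausible reading: `.` does not match `\n` by default, and a trailing `\r` or space before the end of the text can stop `$` from matching. The helper names `ExtractAdded` and `ExtractAddedWithIndexOf` are hypothetical, and the input string is a shortened stand-in for the file in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimestampDemo
{
    // Hypothetical helper: returns the mm:ss text after data-target="added">, or null.
    public static string ExtractAdded(string html)
    {
        // \s* skips the newline and indentation after the opening tag, so
        // RegexOptions.Singleline is not needed and no $ anchor is involved.
        Match m = Regex.Match(html, "data-target=\"added\">\\s*(\\d{2}:\\d{2})");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Hypothetical helper: same idea with string functions only.
    public static string ExtractAddedWithIndexOf(string html)
    {
        const string marker = "data-target=\"added\">";
        int start = html.IndexOf(marker, StringComparison.Ordinal);
        if (start < 0) return null;
        start += marker.Length;

        // Stop at the next tag if there is one, otherwise take the rest of the text.
        int end = html.IndexOf('<', start);
        string raw = end < 0 ? html.Substring(start) : html.Substring(start, end - start);
        return raw.Trim();
    }

    public static void Main()
    {
        // Assumed sample: Windows line endings and no closing tag, as in the question.
        string html = "<div class=\"wrapper ellipsis timestamp\" data-target=\"added\">\r\n      05:23\r\n";
        Console.WriteLine(ExtractAdded(html));            // prints 05:23
        Console.WriteLine(ExtractAddedWithIndexOf(html)); // prints 05:23
    }
}
```

Both helpers return null when the marker is missing, so the caller can tell "not found" apart from an empty value.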

1 Answer

If you can treat the input as well-formed XML, the code below works nicely:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string input =
                "<Root>" +
                    "<td class=\"artist\">" +
                      "<div class=\"wrapper ellipsis\">" +
                        "<a class=\"evt-click\" href=\"/artist/7\" data-target=\"artist\" itemprop=\"byArtist\">Jamiroquai</a>" +
                      "</div>" +
                    "</td>" +
                    "<td class=\"album\">" +
                      "<div class=\"wrapper ellipsis\">" +
                        "<a class=\"evt-click\" href=\"/album/98952\" itemprop=\"inAlbum\" data-target=\"album\" >A Funk Odyssey</a>" +
                      "</div>" +
                    "</td>" +
                    "<td class=\"length\">" +
                      "<div class=\"wrapper\" data-target=\"length\"></div>" +
                    "</td>" +
                    "<td class=\"popularity\" title=\"By popularity:7.85 / 10\">" +
                      "<span class=\"note\" data-target=\"note\"></span>" +
                    "</td>" +
                    "<td class=\"added\">" +
                      "<div class=\"wrapper ellipsis timestamp\" data-target=\"added\">" +
                        "05:23" +
                      "</div>" +
                    "</td>" +
                "</Root>";

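            // Parse the fragment as XML; this requires the input to be well-formed.
            XElement doc = XElement.Parse(input);

            // The (string) cast yields null instead of throwing when a div has no class attribute.
            XElement timestamp = doc.Descendants("div")
                .FirstOrDefault(x => (string)x.Attribute("class") == "wrapper ellipsis timestamp");

            // Guard against a missing element before reading its value.
            string result = timestamp != null ? timestamp.Value.Trim() : null;
            Console.WriteLine(result);
        }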
    }
}
jdweng
  • You cannot use an XML parser to parse HTML. For example, void elements are valid HTML tags that don't require a closing tag. – L.B Aug 29 '15 at 22:35
  • It is the same as posting a random JSON text, for example, and saying *"If you can use JSON the code below works nicely"*. And it is clear from the question that the OP cannot use XML parsers; otherwise you wouldn't have had to fix the sample HTML just to be able to parse it. – L.B Aug 30 '15 at 07:06
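
To illustrate the objection above, here is a small sketch of what happens when `XElement.Parse` meets markup that is valid HTML but not well-formed XML. The `<br>` tag is my own illustrative choice of a void element, and `ParsesAsXml` is a hypothetical helper, not something from the answer.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class VoidTagDemo
{
    // Hypothetical helper: reports whether the text parses as well-formed XML.
    public static bool ParsesAsXml(string markup)
    {
        try
        {
            XElement.Parse(markup);
            return true;
        }
        catch (XmlException)
        {
            // Thrown for mismatched or unclosed tags, among other problems.
            return false;
        }
    }

    public static void Main()
    {
        // Every tag is closed, so this is well-formed XML.
        Console.WriteLine(ParsesAsXml("<td>05:23<br/></td>")); // prints True

        // Valid HTML, but <br> is never closed, so the XML parser rejects it.
        Console.WriteLine(ParsesAsXml("<td>05:23<br></td>"));  // prints False
    }
}
```

The fragment in the question also ends without its closing tags, which is why the answer had to wrap and complete the sample before it could be parsed.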