1

I have HTML that I need to parse on the fly. I need to find specific divs in it whose id is "content-text-" followed by six digits (like "content-text-123456"), and I don't know the digits beforehand. Is there any way to use a placeholder for the digits at the end of the string I'm searching for (like "content-text-######")? Searching for just "content-text-" does not work.

I'm doing this project on Windows Phone 8.1 with C# if it matters.

EDIT:

    WPPageResponse response = JsonConvert.DeserializeObject<WPPageResponse>(json);

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(response.content);

    foreach (var node in doc.DocumentNode.Descendants("div").Where(div => div.GetAttributeValue("id", "") == "content-text-######"))
    {
        // Gather the data from each matching div
    }

Here is some code, if it helps. It works when I know the numbers and search for them exactly, but I can't know all the numbers in advance.

Tontsasd
  • 1
    Why does searching for `"content-text-"` not work? Or do you mean that it doesn't help, since you need the numbers? – Rob Dec 17 '15 at 02:51
  • 2
    Use a regular expression [Regex.Replace](https://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.replace(v=vs.110).aspx) – Quantumplate Dec 17 '15 at 02:52
  • 2
    Show your code so we can help fix it - it will also clarify your intent here. – Mark Schultheiss Dec 17 '15 at 03:03
  • Yes, I need the numbers; searching without them just returns nothing. – Tontsasd Dec 17 '15 at 03:04
  • Don't parse HTML or XML yourself. Use a DOM parser, which will do it all for you properly. This has been said here about a million times before, which a search for *parse HTML* will find for you. – Ken White Dec 17 '15 at 03:11
  • Instead of `div.GetAttributeValue("id", "") == "content-text-######"`, why don't you use StartsWith: `div.GetAttributeValue("id", "").StartsWith("content-text-")`? – thorkia Dec 17 '15 at 03:18
  • Well, that is a pretty good point. I have no idea why I tried to use "==" there instead of StartsWith. Thanks a lot for covering for my stupidity. – Tontsasd Dec 17 '15 at 03:24
  • I'd consider how widespread this parsing is going to get in your application, and who else is going to work with this code. If it's a one off or in throwaway code, then StartsWith(), int.TryParse() may be just dandy. If you're doing a lot of parsing, and especially in production code, a number of people would agree with [Ken White's comment](http://stackoverflow.com/questions/34325814/is-there-any-way-to-substitute-numbers-in-string-c#comment56393858_34325814) above. – El Zorko Dec 17 '15 at 03:32
  • @ElZorko: Thanks for finding [the post I wanted](http://stackoverflow.com/a/1732454/62576) for me. Got a phone call that delayed my search for it. – Ken White Dec 17 '15 at 03:41
  • @KenWhite Great minds, but really you make a good point. It's easy to overlook the drawbacks to the codebase in the long term when taking the seemingly obvious path. – El Zorko Dec 17 '15 at 03:54
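
The StartsWith and int.TryParse suggestions from the comments can be sketched roughly as below. This is only an illustration under stated assumptions: the class name `ContentTextId` and its two methods are hypothetical, and the check assumes the ids are always "content-text-" followed by exactly six ASCII digits, as described in the question.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not from the original post): checks an id string
// that has already been read from a div by the HTML parser.
public static class ContentTextId
{
    private const string Prefix = "content-text-";
    private const int DigitCount = 6; // assumption taken from the question

    // True when id is exactly "content-text-" followed by six ASCII digits.
    public static bool IsMatch(string id)
    {
        if (id == null || id.Length != Prefix.Length + DigitCount)
        {
            return false;
        }
        if (!id.StartsWith(Prefix, StringComparison.Ordinal))
        {
            return false;
        }
        for (int i = Prefix.Length; i < id.Length; i++)
        {
            if (id[i] < '0' || id[i] > '9')
            {
                return false;
            }
        }
        return true;
    }

    // Extracts the numeric suffix; returns false for ids that do not match.
    public static bool TryGetNumber(string id, out int number)
    {
        number = 0;
        if (!IsMatch(id))
        {
            return false;
        }
        return int.TryParse(id.Substring(Prefix.Length), NumberStyles.None,
                            CultureInfo.InvariantCulture, out number);
    }
}
```

With the loop from the question, the predicate would then read `div => ContentTextId.IsMatch(div.GetAttributeValue("id", ""))`, so the HTML itself is still handled by the DOM parser and only the attribute value is inspected.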

1 Answer

-2

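You can use Regex.Replace for this; it replaces every digit in the string with a placeholder character.

    // Requires: using System; using System.Text.RegularExpressions;
    string data = "MyTest = 5564327";
    // Replace each digit (\d) with '#'
    string output = Regex.Replace(data, @"\d", "#");
    Console.WriteLine(output);
    Console.Read(); // Keep the console window open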

Output is:

MyTest = #######
Krythic
  • 1
    Do **not** advise people to use regexes to parse HTML or XML. Trying to parse HTML yourself instead of using a DOM parser is bad enough; don't make it twice as bad by adding regular expressions to the problem. Search this site for *parse HTML regular expressions* and read the post with the most votes. – Ken White Dec 17 '15 at 03:27
  • 1
    No downvote here, but depending on how [complex your application is going to get](http://blog.codinghorror.com/parsing-html-the-cthulhu-way/), this approach may (or may not) result in [madness](http://stackoverflow.com/a/1732454/58063). – El Zorko Dec 17 '15 at 03:28
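
One possible middle ground between the answer and the comments, sketched here under assumptions: keep the DOM parser for the HTML and apply a regular expression only to the id string that the parser has already extracted. The class name `ContentTextIdPattern` and the method `GetDigitsOrNull` are hypothetical, and the pattern assumes exactly six ASCII digits after "content-text-", as in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original post): matches an id value,
// never the raw HTML.
public static class ContentTextIdPattern
{
    // ^ and \z anchor the whole string; [0-9] is used instead of \d
    // because \d in .NET also matches non-ASCII Unicode digits.
    private static readonly Regex Pattern =
        new Regex(@"^content-text-([0-9]{6})\z", RegexOptions.CultureInvariant);

    // Returns the six digits as a string, or null when the id does not match.
    public static string GetDigitsOrNull(string id)
    {
        if (id == null)
        {
            return null;
        }
        Match match = Pattern.Match(id);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Because the pattern only ever sees a short attribute value, the usual objections to running regular expressions over HTML markup do not apply to it in the same way.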