68

I'm wondering if there's a jQuery-like css selector that can be used in C#.

Currently, I'm parsing some html strings using regex and thought it would be much nicer to have something like the css selector in jQuery to match my desired elements.

Keavon
Dave
  • So, I guess there's currently nothing like this – Dave Oct 16 '09 at 22:32
  • How does XPath querying not meet your needs? Load string into DOM object with XML or HTML parser, and query for elements based on whatever you like. Just like jQuery. – patjbs Oct 16 '09 at 23:00
  • If you desire an easier to grep query structure, have you tried using linq queries? – patjbs Oct 16 '09 at 23:03

5 Answers

81

Update 10/18/2012

CsQuery is now in release 1.3. The latest release incorporates a C# port of the validator.nu HTML5 parser. As a result, CsQuery now produces a DOM that follows the HTML5 spec's handling of invalid markup and is completely standards compliant.

Original Answer

Old question but new answer. I've recently released version 1.1 of CsQuery, a jQuery port for .NET 4 written in C# that I've been working on for about a year. It's also on NuGet as "CsQuery".

The current release implements all CSS2 & CSS3 selectors, all jQuery extensions, and all jQuery DOM manipulation methods. It's got extensive test coverage, including all the tests from jQuery and Sizzle (the jQuery CSS selection engine). I've also included some performance tests for direct comparisons with Fizzler; for the most part CsQuery dramatically outperforms it. The exception is actually loading the HTML in the first place, where Fizzler is faster; I assume this is because Fizzler doesn't build an index. You get that time back after your first selection, though.

There's documentation on the github site, but at a basic level it works like this:

Create from a string of HTML

CQ dom = CQ.Create(htmlString);

Load synchronously from the web

CQ dom = CQ.CreateFromUrl("http://www.jquery.com");

Load asynchronously (non-blocking)

CQ.CreateFromUrlAsync("http://www.jquery.com", responseSuccess => {
    Dom = responseSuccess.Dom;
}, responseFail => {
    ..
});

Run selectors & do jQuery stuff

var childSpans = dom["div > span"];
childSpans.AddClass("myclass");

The CQ object is like the jQuery object. The property indexer used above is the default method (like $(...)).

Output:

string html = dom.Render();
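
To show how those pieces fit together, here is a minimal sketch that loads a string, selects with a CSS selector, changes the matches and renders the result. It assumes the CsQuery NuGet package is referenced. The sample HTML, the class name CsQuerySketch and the helpers CountFooDivs and MarkFooDivs are made up for illustration; only CQ.Create, the selector indexer, Length, AddClass and Render are CsQuery calls.

```csharp
using System;
using CsQuery;

public static class CsQuerySketch
{
    // Hypothetical helper: counts the <div> elements that carry the class "foo".
    public static int CountFooDivs(string html)
    {
        CQ dom = CQ.Create(html);
        return dom["div.foo"].Length;
    }

    // Hypothetical helper: adds the class "seen" to every div.foo and
    // returns the re-rendered HTML.
    public static string MarkFooDivs(string html)
    {
        CQ dom = CQ.Create(html);
        dom["div.foo"].AddClass("seen");
        return dom.Render();
    }

    public static void Main()
    {
        // Made-up sample markup.
        string html = "<div class=\"foo\">one</div>"
                    + "<div class=\"bar\">two</div>"
                    + "<div class=\"foo baz\">three</div>";

        Console.WriteLine(CountFooDivs(html));
        Console.WriteLine(MarkFooDivs(html));
    }
}
```

The selector string is the same one you would pass to $(...) in jQuery, so "div.foo" here plays the role of the Doc.select("div.foo") call the question asks for.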
Jamie Treworgy
  • Do you handle cases where there are new-lines, line-breaks, and tabs as whitespace separating the class names? – casperOne Jun 19 '12 at 18:41
  • Just added a test for this, it already correctly interprets any whitespace in classes as a separator. So the answer is yes. – Jamie Treworgy Jun 19 '12 at 19:08
  • Thanks for the info. The question is unfortunately NC, but I've run into this specific issue a number of times. – casperOne Jun 19 '12 at 19:15
  • 3
    By the way, is there some reason why you are closing all the old questions that ask "is there a jquery port for c#" because I've answered it, nearly three years later, now that there is? Whether or not you agree that the question is a good one for SO, it's been here for years, and appears high in google searches for the question. I would like people to be able to find this. To close it now seems, well, a bit vindictive. The only consequence will be that this project, which is free, useful, and MIT licensed, and didn't exist in a complete form until recently, will have less exposure. – Jamie Treworgy Jun 19 '12 at 19:15
  • It's actually the duplicate answer flags that the system is bringing up. These questions are list questions and they are specifically not allowed on Stack Overflow. There was a time when these questions *were* ok for Stack Overflow, but that time is no longer, and we close these when we see them. That said, if you're going to repeat the same answer, and the question is really a duplicate, you should flag it for moderator attention to be closed as such. Otherwise, we will probably delete the answers if we find they are on all duplicate questions. – casperOne Jun 19 '12 at 19:22
  • 1
    Well, I guess it's your call, I think it's too bad that you are using the "letter of the law" to hinder my efforts to let people know about this project. I answered this less than a day ago and have gotten two upvotes already, so I guess people are finding it useful even as you are not. Too bad it will be gone from SO tomorrow. – Jamie Treworgy Jun 19 '12 at 19:26
  • I gave you one of the upvotes, so don't be so quick to judge. That said, Stack Overflow is not a place for promotion. It's [explicitly forbidden in the FAQ](http://stackoverflow.com/faq#promotion). This has been hashed out *many, many* times on [meta]. There is *no* leeway on this. – casperOne Jun 19 '12 at 19:30
  • It says "post good, relevant answers, and if some (but not all) happen to be about your product or website, so be it". I have answered *hundreds* of questions having nothing to do with my projects (not *products* or *websites*, anyway) over the years. I think my answer qualifies as "good and relevant" including example code and details, and I certainly disclosed my affiliation. This directive clearly seems oriented towards commercial interests. This is not one, and I am definitely not a spammer, which I think you can verify with a review of my history on SO. – Jamie Treworgy Jun 19 '12 at 19:35
  • The directive is not towards commercial interests, it's towards content that is repeated. It might be a relevant answer to the question, but you should judge whether or not it's a good question (which this clearly is not). That said, there are plenty of issues on [meta] which reference promotion as well as the NC closing of list questions. If you want to follow up there and get the community's take on it, I recommend that. – casperOne Jun 19 '12 at 19:41
  • CsQuery is no longer maintained. The author suggests considering AngleSharp instead: https://github.com/AngleSharp/AngleSharp – Jeroen K May 15 '17 at 14:57
72

You should definitely see @jamietre's CsQuery. Check out his answer to this question!

Fizzler and Sharp-Query provide similar functionality, but the projects seem to be abandoned.

Andy S
2

Not quite jQuery like, but this may help: http://www.codeplex.com/htmlagilitypack
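
For comparison, this is roughly what a "div.foo" lookup looks like with the HTML Agility Pack and XPath. It is a sketch, assuming the HtmlAgilityPack NuGet package is referenced. The class name AgilityPackSketch, the helper CountFooDivs and the sample HTML are made up for illustration; HtmlDocument, LoadHtml and DocumentNode.SelectNodes are the library's own calls.

```csharp
using System;
using HtmlAgilityPack;

public static class AgilityPackSketch
{
    // Hypothetical helper: counts the <div> elements that carry the class "foo".
    public static int CountFooDivs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath 1.0 has no class selector, so the class attribute is padded
        // with spaces and searched for the token " foo ". This avoids matching
        // "foobar" and still matches class="foo baz".
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(
            "//div[contains(concat(' ', normalize-space(@class), ' '), ' foo ')]");

        // SelectNodes returns null, not an empty collection, when nothing matches.
        return nodes == null ? 0 : nodes.Count;
    }

    public static void Main()
    {
        // Made-up sample markup.
        string html = "<div class=\"foo\">one</div>"
                    + "<div class=\"foobar\">two</div>"
                    + "<div class=\"foo baz\">three</div>";

        Console.WriteLine(CountFooDivs(html));
    }
}
```

It works, but the XPath needed to match a single class name shows why the selector syntax reads more cleanly.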

Daniel
  • 2
    Yes... I just looked over the HTML Agility Pack a few days ago. But it still uses XPath for matching. It's not that I don't like XPath, but the cleanness of the CSS selector syntax is much better imo. – Dave Oct 16 '09 at 22:08
  • LINQ-to-Objects is probably what I'd use. But right - not as clean as selectors. – Daniel Oct 19 '09 at 20:54
1

For XML you might use XPath...
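
A minimal sketch of that with the XPath extensions in System.Xml.Linq, which needs no extra packages. The class name XmlXPathSketch, the helper FooItemValues and the sample XML are made up for illustration. Note that this only works on well-formed XML, so it is not a fit for arbitrary HTML.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XmlXPathSketch
{
    // Hypothetical helper: returns the text of every <item kind="foo"> element.
    public static string[] FooItemValues(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.XPathSelectElements("//item[@kind='foo']")
                  .Select(e => e.Value)
                  .ToArray();
    }

    public static void Main()
    {
        // Made-up sample document.
        string xml = "<items>"
                   + "<item kind=\"foo\">one</item>"
                   + "<item kind=\"bar\">two</item>"
                   + "<item kind=\"foo\">three</item>"
                   + "</items>";

        foreach (string value in FooItemValues(xml))
        {
            Console.WriteLine(value);
        }
    }
}
```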

Frank Schwieterman
1

I'm not entirely clear as to what you're trying to achieve, but if you have an HTML document that you're trying to extract data from, I'd recommend loading it with a parser. After that, it becomes fairly trivial to query the resulting object and pull out the desired elements.

The parser I linked above allows for use of XPath queries, which sounds like what you are looking for.

Let me know if I've misunderstood.

Community
patjbs
  • May I know what parser you are referring to? I just want something like Doc.select("div.foo") to return all the elements that are divs and have class foo. – Dave Oct 16 '09 at 21:58
  • I added a link to the text, which points to a SO question about parsing HTML. In particular, I've used the HTML Agility Pack parser in the past to load HTML docs and query against them with great success. – patjbs Oct 16 '09 at 22:00