Questions tagged [scrapysharp]

ScrapySharp is a .NET scraping library whose web client can simulate a real web browser; it wraps the HtmlAgilityPack API.

19 questions
6
votes
1 answer

How to use ScrapySharp to parse elements in an html document?

Here's the project's official "Documentation": https://bitbucket.org/rflechner/scrapysharp/wiki/Home No matter what I try, I can't find the CssSelect() method that the library is supposed to add to make querying things easier. Here's what I've…
sergserg
  • 21,716
  • 41
  • 129
  • 182
5
votes
3 answers

Keep getting stuck loading when ScrapySharp NavigateToPage

My browser just keeps loading when I call NavigateToPage with ScrapySharp, and execution never reaches the next line of code. Below is my code, a C# ASP.NET Web Forms page. May I know why? The link I use works and can be browsed manually. The code just gets stuck at the…
Tiong Gor
  • 73
  • 1
  • 8
3
votes
2 answers

How to find the form in scrapysharp when it only has attributes i.e. no name or id

I am new to ScrapySharp as well as to web scraping. I am trying to scrape a site that is secured and has a login screen. The form element does not have a name or id attribute, which makes my life more complicated. I have been unable to figure out how…
2
votes
0 answers

Using scrapy sharp with login

I have created a simple Windows app using C# Windows Forms and the ScrapySharp library. I have parsed data without problems, but now I need to parse data from another page, which requires a login. The problem is I don't know how to save cookies in this…
2
votes
1 answer

How do I get scrapysharp to work in a MVC web app?

I have ScrapySharp working successfully in a console app. I created a new MVC web app in VS2013 with no authentication or anything else special. I used NuGet to add ScrapySharp and then put this code in my Home Controller. I get no response to my…
bhs8227
  • 81
  • 9
1
vote
0 answers

System.invalidoperationexception sequence contains no elements Web Scraper

I'm trying to create a web scraper with ScrapySharp. I encountered the following error: System.InvalidOperationException: Sequence contains no elements. Code - Scraper class: static void Main(string[] args) { …
1
vote
1 answer

ScrapySharp Form Submit causing System.AggregateException

I've spent hours racking my brain over why this isn't working. I'm trying to use ScrapySharp to scrape websites; right now I'm just trying out sample sites before moving on to my actual site. Every time I do a form.Submit() in my program I get hit with a…
1
vote
1 answer

C# ScrapySharp 'System.Net.CookieException: 'The 'Name'='HttpOnly, NID' part of the cookie is invalid.'

So I'm facing an unexpected issue with my code. For some reason, I am unable to download and print the links from my Google search... Help is much appreciated, as I'm really not sure what is going on here... I am also using the DotNET SDK using…
1
vote
1 answer

C# scrape correct web content following jquery

I've been using HtmlAgilityPack for a while, but the web resource I am working with now has what seems to be a jQuery protocol that the browser passes through. What I expect to load is a product page but what actually loads (verified by a WebBrowser…
Xero Phane
  • 88
  • 8
1
vote
1 answer

Click on HTML elements with Scrapy (WebScraping)

I'm writing a program in C# using ScrapySharp or HtmlAgilityPack. The drawback is that part of the information I need only appears when I click on an HTML element (button, link). In some forums it was commented that when using…
1
vote
2 answers

ScrapySharp causes Windows Form to freeze without exception

When included in my code, the lines ScrapingBrowser browser = new ScrapingBrowser(); WebPage testPage = browser.NavigateToPage(new Uri("https://www.google.co.uk/")); cause the Windows Form to stop working (once this line is reached in execution, the form…
KangarooChief
  • 381
  • 3
  • 14
0
votes
0 answers

C# - ScrapySharp - how to get the <title> from the <head>?

I want to get the title from a webpage through ScrapySharp. With CssSelect, I'm only getting the <body> tag. I am using it like: string SearchQuery = PageResult.Html.CssSelect(".breadcrump-summary").First().InnerText;
c# web-scraping scrapysharp
asked Jul 30 '20 at 19:10
Jonathan Kopka
  • 1
  • 2
0
votes
1 answer

Mocking ScrapySharp response for unit test

I'm using ScrapySharp in my clean architecture solution and I need to mock a Scraping service response in my unit tests so that the unit test is self contained and not actually hitting any external server. I've looked at using Moq but don't see a…
c# unit-testing tdd scrapysharp
asked Nov 11 '19 at 15:14
Colm
  • 3
  • 1
0
votes
1 answer

Scrape a table using ScrapySharp and HtmlAgilityPack

I am trying to scrape an economic calendar from a specific website. Actually, I tried many times without any success, I don't know where I am wrong. Can you help me, pls? using System; using System.Collections.Generic; using System.Linq; using…
scrapysharp
asked Sep 25 '19 at 14:44
Stagnoman
  • 1
  • 1
0
votes
1 answer

Scraping an IFrame which has as source a jsp

I'm new to webscraping and I have to do the following: 1. Go to a webpage 2. Find an element 3. Get its value Now I don't have a problem going to the webpage, that works fine. The problem is that the element I need, actually comes from a jsp which…
c# web-scraping html-agility-pack scrapysharp
asked Jun 14 '18 at 09:19
Bart Schelkens
  • 1,235
  • 4
  • 21
  • 45