I spent hours racking my brain over why this isn't working.

I'm trying to use ScrapySharp to scrape websites. Right now I'm just trying out sample sites before moving on to my actual site.

Every time I call form.Submit() in my program, I get a System.AggregateException (Specified cast is not valid).

My code:

using System;
using System.IO;
using System.Linq;
using System.Net;
using HtmlAgilityPack;
using ScrapySharp.Extensions;
using ScrapySharp.Html;
using ScrapySharp.Html.Forms;
using ScrapySharp.Network;

namespace WebScraper
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            ScrapingBrowser browser = new ScrapingBrowser();

            //set UseDefaultCookiesParser as false if a website returns invalid cookies format
            //browser.UseDefaultCookiesParser = false;
            browser.AllowAutoRedirect = true;
            browser.AllowMetaRedirect = true;
            WebPage homePage = browser.NavigateToPage(new Uri("http://the-internet.herokuapp.com/login"));

            PageWebForm form = homePage.FindForm("login");
            form["username"] = "tomsmith";
            form["password"] = "SuperSecretPassword!";
            form.Method = HttpVerb.Get; //I tried both .Post and .Get
            WebPage resultsPage = form.Submit(); //THIS IS WHERE I GET THE ERROR
            Console.WriteLine(resultsPage);

        }
    }
}

My error:

System.AggregateException: One or more errors occurred. (Specified cast is not valid.) ---> System.InvalidCastException: Specified cast is not valid.
  at ScrapySharp.Network.ScrapingBrowser.CreateRequest (System.Uri url, ScrapySharp.Network.HttpVerb verb) [0x0000b] in <0a639adc663f45108f057c429262c620>:0
  at ScrapySharp.Network.ScrapingBrowser.NavigateToPageAsync (System.Uri url, ScrapySharp.Network.HttpVerb verb, System.String data, System.String contentType) [0x00066] in <0a639adc663f45108f057c429262c620>:0
   --- End of inner exception stack trace ---
  at System.Threading.Tasks.Task.ThrowIfExceptional (System.Boolean includeTaskCanceledExceptions) [0x00011] in /Users/builder/jenkins/workspace/build-package-osx-mono/2019-06/external/bockbuild/builds/mono-x64/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs:2027
  at System.Threading.Tasks.Task`1[TResult].GetResultCore (System.Boolean waitCompletionNotification) [0x0002b] in /Users/builder/jenkins/workspace/build-package-osx-mono/2019-06/external/bockbuild/builds/mono-x64/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Future.cs:496
  at System.Threading.Tasks.Task`1[TResult].get_Result () [0x00000] in /Users/builder/jenkins/workspace/build-package-osx-mono/2019-06/external/bockbuild/builds/mono-x64/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Future.cs:466
  at ScrapySharp.Network.ScrapingBrowser.NavigateToPage (System.Uri url, ScrapySharp.Network.HttpVerb verb, System.String data, System.String contentType) [0x0000b] in <0a639adc663f45108f057c429262c620>:0
  at ScrapySharp.Html.Forms.PageWebForm.Submit () [0x00023] in <0a639adc663f45108f057c429262c620>:0
  at WebScraper.MainClass.Main (System.String[] args) [0x00065] in /Users/arib/Projects/WebScraper/WebScraper/Program.cs:29

I'm so tired of this error. Any and all help is much appreciated. Thank you!

arcanium0611
  • I ran your code with several different versions of ScrapySharp and it works every time. A lot of other people hit this bug, and it comes up for various reasons, including in unit test projects. Try creating a new solution and see how things work, then include your other projects one at a time to see what causes it. – Jawad Dec 28 '19 at 03:11
  • Thanks so much. I thought I was doing something wrong. The bug set me back a few days, so I just scrapped it and started afresh without ScrapySharp (using HttpClient from System.Net.Http). I'll try it again in a new project, this time on a different PC just to be safe. – arcanium0611 Dec 28 '19 at 17:39

1 Answer

The problem is that when you use form["username"], the result is a string. You want to get the FormField itself and set its Value, which you can do using this code:

WebPage homePage = browser.NavigateToPage(new Uri("http://the-internet.herokuapp.com/login"));
PageWebForm form = homePage.FindForm("login");
var formFields = form.FormFields;
foreach (var field in formFields)
{
    if (field.Name.Equals("username", StringComparison.OrdinalIgnoreCase))
    {
        field.Value = "tomsmith";
    }
    else if (field.Name.Equals("password", StringComparison.OrdinalIgnoreCase))
    {
        field.Value = "SuperSecretPassword!";
    }
}

WebPage resultsPage = form.Submit();
Console.WriteLine(resultsPage);

Alternatively, you could use Find() to get the FormField:

var usernameField = form.FormFields.Find(x => x.Name == "username");
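The Find() line above only retrieves the field; the value still has to be assigned, and Find() returns null when nothing matches. A minimal sketch of that step follows. It assumes form.FormFields is a List<FormField> (as the Find() call implies) and that FormField is a class with settable Name and Value properties. TrySetField and FormFieldHelper are hypothetical names, not part of ScrapySharp.

```csharp
using System;
using System.Collections.Generic;
using ScrapySharp.Html.Forms;

static class FormFieldHelper
{
    // Hypothetical helper, not part of ScrapySharp.
    // Assumes FormField is a reference type exposing Name and Value.
    public static bool TrySetField(List<FormField> fields, string name, string value)
    {
        // List<T>.Find returns null for a reference type when no element matches.
        FormField field = fields.Find(
            x => string.Equals(x.Name, name, StringComparison.OrdinalIgnoreCase));

        if (field == null)
        {
            return false; // no field with that name on this form
        }

        field.Value = value;
        return true;
    }
}
```

You would call it as FormFieldHelper.TrySetField(form.FormFields, "username", "tomsmith") before form.Submit(), and check the return value so a misspelled field name does not go unnoticed.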
Darren Griffith
  • Thanks so much! I did not expect to get a solution to this months later, much appreciated. I'm going back and redoing this using ScrapySharp! – arcanium0611 Jul 31 '20 at 14:46
  • @arcanium0611, glad I could help! I just started using ScrapySharp, and when I got it to work, I wanted to share for the next person who is getting started. – Darren Griffith Jul 31 '20 at 22:21