I'm building a backend in C# with ASP.NET Core. When the client sends an HTTP POST, I want to start an async task that can run for days, so I just want to fire and forget. I'm doing that with Task.Factory.StartNew(), but for some reason the task stops at a random point every time it runs, sometimes after several hours, sometimes after 30 seconds. You can still send another HTTP POST and start a new task, but that one won't complete either.

        [HttpPost]
        public async Task<ActionResult<ScrapeCreateDto>> CreateScraper(ScrapeCreateDto scrape)
        {
            Scrape scrapeModel = new Scrape();
            scrapeModel = (Scrape)_mapper.Map(scrape, scrapeModel, typeof(ScrapeCreateDto), typeof(Scrape));

            if (!await _validation.ValidateURL(scrapeModel.Url)) return NotFound();
            Console.WriteLine($"Going to play {scrapeModel.Url} {scrapeModel.Views} times");

            var SoundCloudUrl = new ScrapeUrl();
            Task ScrapeSoundCloud = SoundCloudUrl.ScrapeHtml(scrapeModel);
            Task.Factory.StartNew(() => ScrapeSoundCloud); // Fire and forget; it currently stops after about 170 minutes

            return Ok();
        }

    public class ScrapeUrl
    {
        public async Task ScrapeHtml(Scrape scrape)
        {
            var scrapeSoundCloud = await ScrapeSoundCloud(scrape);
        }

        private async Task<string> ScrapeSoundCloud(Scrape scrape)
        {
            string fullUrl = scrape.Url;

            int prevViews = 0;

            var options = new LaunchOptions()
            {
                Headless = true,
                ExecutablePath = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe",
                Args = new string[] { "--incognito"}
            };

            for (int i = 0; i < scrape.Views; i++)
            {
                using (var browser = await Puppeteer.LaunchAsync(options, null, PuppeteerSharp.Product.Chrome))
                {
                    using (var page = await browser.NewPageAsync())
                    { 
                        // this takes about half a minute, but it runs thousands of times
                    }
                }

                await Task.Delay(3000);
            }
            Console.WriteLine("Done");
            return "Done";
        }
    }

I know that right now it's useless to call ScrapeSoundCloud from another task, but later I want to start more things from ScrapeHtml. I also know that errors can be a problem with async code because exceptions are sometimes not thrown, but I was printing to the console and seeing the errors as they happened, so it didn't throw any error; it just stopped.

EDIT: The answers are in the comments. This question has also been linked to similar threads.

  • But you said you “want to fire and forget” it, so why do you care if it stops? The whole idea of the “forget” part is that you don’t care what happens. If you do care, then don’t forget it. Probably some exception is killing it, but you never get to find out because you “forgot” it. Choose what you actually want to do – Caius Jard Apr 06 '21 at 06:27
  • There's nothing random about this. You can't start a background job like this. As soon as the request is finished all the objects and services created during its processing are disposed. There's no reason to use `Task.Factory.StartNew(` instead of `Task.Run` too. Besides, `ScrapeHtml` is *already* running. It makes no sense to call `Task.Run()` on an already running task – Panagiotis Kanavos Apr 06 '21 at 06:28
  • You have no Exception handling and no logging in that loop so when it fails you have no idea why. – Ian Mercer Apr 06 '21 at 06:28
  • The line `Task.Factory.StartNew(() => ScrapeSoundCloud);` is meaningless. `ScrapeSoundCloud` is an already active task. Tasks aren't threads, they don't need to be "started". This code would run if you used `await SoundCloudUrl.ScrapeHtml(scrapeModel)` to await the already running task and prevent the action from terminating. To run `ScrapeHtml` in the background though, you need a [BackgroundService](https://learn.microsoft.com/en-us/aspnet/core/fundamentals/host/hosted-services?view=aspnetcore-5.0&tabs=visual-studio) – Panagiotis Kanavos Apr 06 '21 at 06:31
  • Further to previous comments, probably what you want to do is create some sort of "Processor" object, which can initialize as "Singleton" scope during service start-up, then inject this object into your controller. The "Processor" class replaces your current "SoundCloudUrl" class - so this at least avoids the object being disposed at the end of the API call problem. You also need to change the code so that the task is not started until your controller says for it to start. But also add logging and exception handling to find out what is really going on as previously suggested. – Ozraptor Apr 06 '21 at 06:34
  • @Ozraptor that's what the `BackgroundService` class does. A simple Singleton won't work. The BackgroundService infrastructure tells the web server that a long running process is used and receives notification when the web app is terminated or the app pool recycled, allowing jobs to exit gracefully – Panagiotis Kanavos Apr 06 '21 at 06:35
  • @Ian Mercer I am actually logging in that loop but that didn't cause the problem cause I saw every error with page not loading and things like that. – Jakub michalenko Apr 06 '21 at 06:38
  • @Jakubmichalenko there's no way you can fix this code. You simply can't create a background job by ignoring `await`. The *web server itself* will dispose all created objects once the request is processed. You need to use a `BackgroundService` – Panagiotis Kanavos Apr 06 '21 at 06:40
  • @PanagiotisKanavos - yes agreed. I would make the injected singleton scoped class implement the IHostedService interface etc such that can run as a background service. – Ozraptor Apr 06 '21 at 06:55
  • Please check out [my blog series on asynchronous messaging](https://blog.stephencleary.com/2021/01/asynchronous-messaging-1-basic-distributed-architecture.html) for a proper solution for request-extrinsic code. As noted in the first comment, "fire and forget" does indeed mean you can lose work; that is completely normal behavior, and that is why "fire and forget" is almost always the wrong solution for request-extrinsic code. – Stephen Cleary Apr 06 '21 at 11:17
  • @Stephen Cleary So you're saying that I should create a queue in some database and have another backend application running that reads from it and does the wanted tasks. But is that really more efficient than just creating a BackgroundService that reads from the queue? Also, if I did it with the queue in the BackgroundService, could I still run multiple queued tasks at once? – Jakub michalenko Apr 06 '21 at 18:33
  • @Jakubmichalenko: Yes, that is what I'm recommending. Efficiency is of less concern than correctness. You can run multiple queued tasks at once. – Stephen Cleary Apr 06 '21 at 19:03
  • @Stephen Cleary: Thank you – Jakub michalenko Apr 06 '21 at 19:05
  • @Stephen Cleary: OK, I've done a lot of things since then, but it still stopped every time. I also added the queue (for now just a background task), and even when it was awaiting the task it just stopped. – Jakub michalenko Apr 09 '21 at 17:01
  • @Jakubmichalenko: Since your code is presumably much changed, try posting a new question. – Stephen Cleary Apr 09 '21 at 22:40
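
For readers following the comments, here is a minimal sketch of the `BackgroundService` approach suggested above. It is not a definitive implementation. It assumes ASP.NET Core 3.0 or later, and the names `BackgroundWorkQueue` and `QueuedWorker` are hypothetical (they are not from the question). The controller only enqueues a work item; the hosted service, whose lifetime is tied to the host rather than to the request, awaits it and logs any exception. The queue is in memory, so queued work is still lost if the process stops, which is why a durable queue is recommended in the comments when the work must not be lost.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

// Hypothetical in-memory queue, meant to be registered as a singleton.
public sealed class BackgroundWorkQueue
{
    private readonly Channel<Func<CancellationToken, Task>> _channel =
        Channel.CreateUnbounded<Func<CancellationToken, Task>>();

    // Called from the controller; returns immediately.
    public bool TryEnqueue(Func<CancellationToken, Task> workItem) =>
        _channel.Writer.TryWrite(workItem);

    // Called from the worker; waits until an item is available.
    public ValueTask<Func<CancellationToken, Task>> DequeueAsync(CancellationToken ct) =>
        _channel.Reader.ReadAsync(ct);
}

// Hypothetical worker that runs for the lifetime of the host.
public sealed class QueuedWorker : BackgroundService
{
    private readonly BackgroundWorkQueue _queue;
    private readonly ILogger<QueuedWorker> _logger;

    public QueuedWorker(BackgroundWorkQueue queue, ILogger<QueuedWorker> logger)
    {
        _queue = queue;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            Func<CancellationToken, Task> workItem;
            try
            {
                workItem = await _queue.DequeueAsync(stoppingToken);
            }
            catch (OperationCanceledException)
            {
                break; // The host is shutting down.
            }

            try
            {
                // Items run one at a time here; the task is awaited, so failures are observed.
                await workItem(stoppingToken);
            }
            catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
            {
                break;
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Background work item failed");
            }
        }
    }
}

// Registration (Startup.ConfigureServices), assuming the types above:
//   services.AddSingleton<BackgroundWorkQueue>();
//   services.AddHostedService<QueuedWorker>();
//
// In the controller action, with BackgroundWorkQueue injected as _queue:
//   _queue.TryEnqueue(_ => new ScrapeUrl().ScrapeHtml(scrapeModel));
//   return Accepted();
```

The work item should capture only plain data such as `scrapeModel`, not request-scoped services, because those are disposed when the request ends. This sketch processes one item at a time; running several at once would need additional code, for example several reader loops over the same channel.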
