
I'm writing a small batch program that fetches images from "libraryofbabel.info" (a site that generates combinations of pixels to produce every possible image).

The program takes a number as input and tries to fetch that many images from the website asynchronously.

When saving the files, a lot of them are corrupted, and I believe this has something to do with the async nature of the code, or possibly with something else.

I tried running the Tasks in batches of 3 at a time with Task.WhenAll() and even added a small delay to give it a rest, but it didn't make anything better.

I also tried waiting on them with Task.WaitAll().

I made the file writing async with WriteAllBytesAsync and WriteAllTextAsync, and that seemed to improve things slightly.

Believe me, I'm not posting the code here and saying: "Here guys, solve it for me, it's not workinggggggggg".

I have been trying to make this code flawless for 4 days. I tried making it in WinForms and had to give up. I am now trying to do it as a console app.

I tried everything I know of to make it smooth.

Errors I find: GenerateSeed() generates a random numeric string of a certain length.

Often it returns "0", so I had to add a retry mechanism. Don't be misled by var seed = "0"; at the beginning: I was getting this result even before I set it manually. And no exception is caught. It just returns 0...

Request: the web request often fails with: An error requesting image has happened. Http status codee: 0, response message: The request timed-out.

I get some errors from the code and some from the server.

Error from server:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator at 
 <a href="/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="5130353c383f113429303c213d347f323e3c">[email&#160;protected]</a> to inform them of the time this error occurred,
 and the actions you performed just before this error.</p>
<p>More information about this error may be available
in the server error log.</p>
<hr>
<address>Apache Server at babelia.libraryofbabel.info Port 443</address>
<script data-cfasync="false" src="/cdn-cgi/scripts/5c5dd728/cloudflare-static/email-decode.min.js"></script></body></html>

Another:

<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>cloudflare</center>
</body>
</html>
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->

And well, I don't think I can do anything about these.

And some errors like "cannot copy stream".

This leads me to think that it's both a problem of "flooding requests" and a not-too-bright server-side implementation.

The success rate is about 50%, i.e. if I ask for 100 images I get roughly 50; the others fail for one reason or another.

I did all the debugging and analysis I could and I cannot find anything else.

Can anyone tell me whether I'm using async/await incorrectly, or whether it's merely a matter of server errors that I can't do anything about?

Here's the code (uses the RestSharp NuGet package):

using RestSharp;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

namespace BabeliaWandererConsole
{
    class Program
    {
        //Random nr generator
        private static readonly Random random = new Random();
        //About maxSeedLenght https://babelia.libraryofbabel.info/about.html
        //Calculated on Wolfram Alpha is 961756 max lenght of the number possible
        private const int maxSeedLenght = 961756;
        //Http request timeout in ms
        private const int requestTimeout = 70000;
        //Default output
        private static string outputDir = "tmp";

        static async Task Main(string[] args)
        {
            Console.WriteLine("Babelia Wanderer 1.0");
            Console.WriteLine("Please input how much images you want to generate, input q to quit");
            var input = Console.ReadLine();

            if (input == "q")
                return;

            Console.WriteLine("Output is default folder (will create a tmp folder where this executable is");
            Console.WriteLine("Do you want to change it? (type y or n)");
            var answer = Console.ReadLine();

            if (answer == "y")
            {
                Console.WriteLine("Please input a valid folder path");
                var path = Console.ReadLine();
                if (Directory.Exists(path))
                {
                    Console.WriteLine("Directory already exists, do you want to use it? (type y or n)");
                    answer = Console.ReadLine();
                    if (answer == "y")
                    {
                        outputDir = path;
                    }
                    else
                    {
                        Console.WriteLine("You typed n, application will now exit. Start again to choose a different folder");
                        return;
                    }
                }
                else
                {
                    Console.WriteLine("Directory does not exists, creating...");
                    try
                    {
                        outputDir = path;
                        if (!Directory.Exists(outputDir))
                            Directory.CreateDirectory(outputDir);
                    }
                    catch (Exception)
                    {
                        Console.WriteLine("Error creating directory. Application will now close");
                        return;
                    }
                }
            }

            Console.WriteLine("Validating input");
            if (!int.TryParse(input, out int imagesToGenerate))
            {
                Console.WriteLine("Input is not a valid number, quitting");
                Console.WriteLine("Press any key to quit");
                Console.ReadLine();
                return;
            }



            Console.WriteLine("Input is valid. Beginning");
            try
            {
                await Generate(imagesToGenerate);
            }
            catch (Exception)
            {

                throw;
            }
            Console.WriteLine("Generation finished, press any key to quit");
            Console.ReadLine();
        }

        public static async Task Generate(int imagesToGenerate = 1)
        {
            List<Task> tasks = new List<Task>();
            for (int i = 0; i < imagesToGenerate; i++)
            {
                var seed = "0";
                while (seed == "0")
                {
                    Console.WriteLine("Seed is 0, maybe some problem happened. Generating another seed");
                    seed = GenerateSeed();
                }
                tasks.Add(Task.Run(() => GenerateFromSeed(seed, i)));
            }

            //Processing list in batches of 3 images per time
            for (int k = 0; k < tasks.Count(); k += 3) 
            {
                var batch = tasks.Skip(k).Take(3);
                Task.WaitAll(batch.ToArray());
            }

        }

        public static async Task GenerateFromSeed(string seed, int loopIndex)
        {
            try
            {
                Console.WriteLine($"Generating image with seed {seed.Substring(0, 15)}........ (Total seed lenght: {seed.Length} charachters)");

                if (loopIndex > 15) //Adding wait time
                {
                    var waitTime = random.Next(1, Convert.ToInt32(Math.Round(loopIndex / 10d)));
                    Console.WriteLine($"Adding {waitTime} seconds wait time to avoid timeouts");
                    Thread.Sleep(waitTime * 1000);
                }

                var client = new RestClient("https://babelia.libraryofbabel.info/babelia.cgi");
                client.Timeout = requestTimeout;
                var request = new RestRequest(Method.POST);
                request.AddHeader("authority", "babelia.libraryofbabel.info");
                request.AddHeader("sec-ch-ua", "\"Google Chrome\";v=\"95\", \"Chromium\";v=\"95\", \";Not A Brand\";v=\"99\"");
                request.AddHeader("sec-ch-ua-mobile", "?0");
                client.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36";
                request.AddHeader("sec-ch-ua-platform", "\"Windows\"");
                request.AddHeader("content-type", "application/x-www-form-urlencoded");
                request.AddHeader("accept", "*/*");
                request.AddHeader("origin", "https://babelia.libraryofbabel.info");
                request.AddHeader("sec-fetch-site", "same-origin");
                request.AddHeader("sec-fetch-mode", "cors");
                request.AddHeader("sec-fetch-dest", "empty");
                request.AddHeader("referer", "https://babelia.libraryofbabel.info/slideshow.html");
                request.AddHeader("accept-language", "en-GB,en-US;q=0.9,en;q=0.8");
                request.AddParameter("location", seed);
                IRestResponse response = await client.ExecuteAsync(request);

                //Managing the bad request codes
                if (response.StatusCode != System.Net.HttpStatusCode.OK)
                {
                    Console.WriteLine($"An error requesting image has happened. Http status codee: {response.StatusCode}, response message: {response.ErrorMessage}");
                    return;
                }

                //Saving the files
                string FileName = Path.GetRandomFileName();
                Console.WriteLine($"Saving file {FileName}");
                await File.WriteAllBytesAsync($@"{outputDir}\{seed.Substring(0, 15)}_{FileName}{random.Next(1, 1000000)}.jpg", response.RawBytes);
                await File.WriteAllTextAsync($@"{outputDir}\{seed.Substring(0, 15)}_{FileName}{random.Next(1, 1000000)}.txt", seed);
            }
            catch (Exception)
            {
                throw;
            }
        }

        //Generates a seed, a string of a max lenght using random predefined characthers
        public static string GenerateSeed(int minSeedLenght = 1, int maxSeedLenght = 961756)
        {
            var seed = "0";
            try
            {
                var chars = "0123456789";
                seed = new string(Enumerable.Repeat(chars, random.Next(minSeedLenght, maxSeedLenght))
                    .Select(s => s[random.Next(s.Length)]).ToArray());
            }
            catch (Exception)
            {
                Console.WriteLine("Error generating seed");
            }
            return seed;
        }
    }
}
  • Two points. The `for (int i = 0; i < imagesToGenerate; i++)` loop is probably susceptible to this problem: [Captured variable in a loop in C#](https://stackoverflow.com/questions/271440/captured-variable-in-a-loop-in-c-sharp). And the `//Processing list in batches of 3 images per time` comment is incorrect. All images are processed concurrently at the same time, because all tasks are started at the same time. Awaiting the tasks in batches has no effect on the tasks. – Theodor Zoulias Nov 11 '21 at 13:59
  • You can look [here](https://stackoverflow.com/questions/10806951/how-to-limit-the-amount-of-concurrent-async-i-o-operations) for ways of limiting the amount of concurrent asynchronous operations. – Theodor Zoulias Nov 11 '21 at 14:05
  • @TheodorZoulias Ok, about the first thing, I read in the comments that after C# 5 they made breaking changes so that it works as expected, but I will investigate whether this is affecting my code. About the second, I will implement the async batch processing and see if I can achieve what you linked to me. Thank you very much for your help – Merdaiolo Nov 11 '21 at 15:43
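
To make the two points from the comments concrete, here is a minimal sketch, not a drop-in fix for the program above. It assumes a `SemaphoreSlim` is an acceptable way to cap concurrency, and it replaces the real HTTP call with a hypothetical `FakeFetchAsync` that only simulates latency. The names `ThrottleSketch`, `RunAsync` and `FakeFetchAsync` are illustrative and do not come from the question. Note also that the C# 5 change to captured loop variables applies to `foreach`; a `for` loop still shares one variable across iterations, so the sketch copies `i` into a local before the lambda captures it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Hypothetical stand-in for the real download; it only simulates latency.
    private static async Task<int> FakeFetchAsync(int index)
    {
        await Task.Delay(50);
        return index;
    }

    // Starts `count` operations but lets at most `maxConcurrency` run at once.
    // Returns each operation's result plus the peak concurrency observed.
    public static async Task<(int[] Results, int Peak)> RunAsync(int count, int maxConcurrency)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);
        var sync = new object();
        int inFlight = 0, peak = 0;
        var tasks = new List<Task<int>>();

        for (int i = 0; i < count; i++)
        {
            // Copy the loop variable: a lambda capturing `i` directly
            // may observe a later value by the time it runs.
            int index = i;

            tasks.Add(Task.Run(async () =>
            {
                // The task is already started; this is where it actually waits its turn.
                await gate.WaitAsync();
                try
                {
                    lock (sync) { inFlight++; peak = Math.Max(peak, inFlight); }
                    return await FakeFetchAsync(index);
                }
                finally
                {
                    lock (sync) { inFlight--; }
                    gate.Release();
                }
            }));
        }

        int[] results = await Task.WhenAll(tasks);
        return (results, peak);
    }

    public static async Task Main()
    {
        var (results, peak) = await RunAsync(10, 3);
        Console.WriteLine($"Completed {results.Length} operations, peak concurrency {peak}");
    }
}
```

The throttling happens inside each task, at `gate.WaitAsync()`, which is why waiting on already-started tasks in groups of three changes nothing. One further thing that may be worth checking, as an assumption about the code above rather than a confirmed diagnosis: a single `System.Random` instance is not thread-safe, and the .NET documentation warns that unsynchronized concurrent use can make it start returning 0, which would be consistent with a seed of "0".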
