2

I'm trying to start a number of threads at the same time but it is not working. Heres my code:

        var previous = 0;

        var threadList = new List<Thread> {};
        for (var i = (int)partSize; i <= responseLength; i = i + (int)partSize)
        {
            var t = new Thread(() => Download(URL, previous, i));
            //t.Name = i.ToString();
            threadList.Add(t);
            //t.Start();
            //t.Join(300);
            //new Thread(() => {Download(URL,previous, i);}).Start();
            //var t = Task.Factory.StartNew(() => Download(URL, previous, i));

            previous = i;
        }

        foreach (Thread t in threadList)
        {
            //Console.WriteLine(t.Name);
            t.Start();
        }

OUTPUT:

77296,86958
77296,86958
77296,86958
77296,86958
77296,86958
77296,86958
77296,86958
77296,86958

After displaying this for a while, it hangs up and eventually crashes.

Expected output with different code:

        for (var i = (int)partSize; i <= responseLength; i = i + (int)partSize)
        {
            var copy = previous;
            var t = new Thread(() => Download(URL, copy, i));
            t.Start();
            t.Join();

            previous = i;
        }

OUTPUT:

0,9662
9662,19324
19324,28986
28986,38648
38648,48310
48310,57972
57972,67634
67634,77296

As for the output the first number indicates where to start downloading a piece of a file and the second number indicates where to finish (in bytes) How can I start every thread in the list, with the parameters I am assigning to it? I am lost, so any help would be awesome. Thanks!

0xSingularity
  • 577
  • 6
  • 36
  • 3
    Use Async Tasks : https://msdn.microsoft.com/en-us/library/jj155756.aspx or this https://msdn.microsoft.com/en-us/library/system.threading.tasks.task%28v=vs.110%29.aspx – Saagar Elias Jacky Apr 24 '15 at 00:18
  • You are likely running into the issue described here: http://stackoverflow.com/questions/271440/captured-variable-in-a-loop-in-c-sharp – Mike Zboray Apr 24 '15 at 00:22
  • Also "not working" is a poor description of any problem. Describe the situation more specifically. A compiler error? An exception? Unexpected output? If there is unexpected output, describe what you expected to get. – Mike Zboray Apr 24 '15 at 00:25
  • ***new Thread() is a legacy already.*** Is it .Net 4.5 or above? What are you trying to achieve so that we can provide you a better solution? – Win Apr 24 '15 at 01:04
  • @Win I am trying to split a file into parts (each part is from a starting point to an ending point) and download each part. Then eventually joining it into a single file in the end. – 0xSingularity Apr 24 '15 at 01:08
  • 1
    @Chocoloco: Leaving aside the bug in your program, I am curious to know why you think that starting more threads is a good idea here. If I want to take $1000 out of a bank account, I don't hire ten guys to stand in line and take $100 out each, and then pool the money afterwards. When you are creating a thread, ask yourself "in the real world, would I hire new staff to do this job?" and if the answer is no, then don't hire a thread. Threads are expensive. – Eric Lippert Apr 24 '15 at 15:00
  • @EricLippert I agree, when it is only a $1000 dollars hiring 10 guys is useless but what if it is 1 million? 1 billion? The files I am downloading are going to be bigger than small pictures. Splitting the files into parts and downloading the parts async is really helping speed it up (well in my personal opinion). – 0xSingularity Apr 24 '15 at 15:26
  • 1
    @Chocoloco: That is very strange. Continuing our analogy, suppose we have ten guys and each is trying to withdraw $100K of our $1M. If there is only one bank teller, nine of those guys are going to be waiting. And if the bank limits the rate at which money can leave the building, even hiring more tellers doesn't help. The operating system limits the number of outbound connections (tellers) and queues up requests if there are extra requests. And adding more threads does not increase the bandwidth of your network card or the web site. – Eric Lippert Apr 24 '15 at 15:51
  • 2
    @Chocoloco: Let's step back. The key to performance analysis is *find the limited resource* and attack that. If CPU power is the limited resource, then you can buy a machine with four CPUs, put a thread on each CPU, and hey, you just got four times faster. But if bandwidth is the limited resource then throwing more threads at the problem makes it worse, not better; now you've got many idle workers all trying to use a saturated connection, and forcing each other to wait longer than they otherwise would. – Eric Lippert Apr 24 '15 at 15:54
  • @EricLippert I see what you are trying to say. And I think it makes a lot of sense. But why do many other download clients choose to split their files and download them? Such as Wgetfast, FDM, etc? There has to be some benefit for splitting. As for my case I have good CPU power but medium bandwidth (15 Mbps). I am only starting 5 threads per file so is the connection really that saturated? Plus I don't know if any of the worker threads are idle they are all downloading the part they are responsible for. – 0xSingularity Apr 24 '15 at 16:04
  • 2
    @EricLippert: There are two cases where having multiple threads downloading segments of a file is effective: [multipart download](https://en.wikipedia.org/wiki/Download_manager#Download_acceleration) for a single server and [segmented file transfer](https://en.wikipedia.org/wiki/Segmented_file_transfer) for multiple servers. – Luca Cremonesi Apr 24 '15 at 17:58

1 Answers1

2

Your lambda expression () => Download(URL, previous, i) is capturing the loop variable i. It's not the value that is captured, but the variable itself. So, by the time the call to Download is executed, i might already have changed. And the same problem exists with the previous variable, which is also shared between iterations of the loop.

The fix is to declare a variable inside the loop, copy the value of i to it, and use that variable in the lambda (and do the same for previous).

    for (var i = (int)partSize; i <= responseLength; i = i + (int)partSize)
    {
        var previous2 = previous;
        var i2 = i;
        var t = new Thread(() => Download(URL, previous2, i2));

See this article by Eric Lippert for a more complete explanation.

Thomas Levesque
  • 286,951
  • 70
  • 623
  • 758
  • Ok so the code fixed the loop issue but now it is hanging up and crashing after it executes. – 0xSingularity Apr 24 '15 at 00:49
  • Also where would recommend I use the t.join() statement? – 0xSingularity Apr 24 '15 at 00:55
  • @Chocoloco, I think you're starting too many threads. Use the ThreadPool instead, or Task.Run. – Thomas Levesque Apr 24 '15 at 00:56
  • @Chocoloco, if you `Join` the thread immediately after you started it, it means you're waiting for each thread to finish before you start another, so you're not running the operations in parallel anymore, you're running them sequentially. Probably not what you want... – Thomas Levesque Apr 24 '15 at 00:57
  • 1
    Anyway, using many threads to download a large file isn't going to be very useful. By default, .NET limits the number of connections to a given server to 2. So most of your threads will just wait for a connection to become available. You can increase the limit, but the server probably will also limit the number of connections from each client, so it won't really help. And anyway, your bandwidth isn't unlimited. If you're already using all the available bandwidth, running many downloads in parallel won't gain you anything. – Thomas Levesque Apr 24 '15 at 01:00
  • @ThomasLevesque: Plenty of servers use a per-connection bandwidth limit. You can work around this limit by using more connections. Of course, administrators are wise to this trick and may block users who use such techniques. – Brian Apr 27 '15 at 13:20