1

I have an array of strings. For each string I have to accomplish a task, and I want to do that in parallel. This is my piece of code:

    void Test()
    {
        sourcefiles.AsParallel().ForAll(MyMethod);
    }

    private void MyMethod(string source)
    {
       string tmp_string = RandomString(10);

       string path = Path.Combine(@"C:\myfolder", tmp_string);
       
      // (File operations on `path` go here)
    }

    private static Random random = new Random();

    public static string RandomString(int length)
    {
        const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

        return new string(Enumerable.Repeat(chars, length)
        .Select(s => s[random.Next(s.Length)]).ToArray());
    }

I receive the following exception: Cannot access C:\myfolder\ABCDEFG because it is in use by another process which is weird because the string "ABCDEFG" should be a random one generated by different processes. This is not an unfortunate event, it happens all the time. I also increased the complexity of RandomString(), but nothing happens, in fact I keep seeing from the debug that there are two different processes with the same tmp_string (other processes have a different one). Where's the mistake?

For context, this is what happens:

    PROCESS - RANDOM_STRING
    1937 - "r6MbODsNcF1654683907030"
    1374 - "MqrdQe386M1654683928872"  <---
    1518 - "iX33edEA5F1654683928873"
    1691 - "MqrdQe386M1654683928872"  <---
    1486 - "u46vqUrt601654684013613"

the first 10 alphanumeric characters are from the RandomString() function, the following digits represent the Unix's epoch

EDIT Some of you suggested that creating in rapid succession the class Random(), would generate the same seed. As some comment states, this is not true, at least in .NET (Core). It works fine either by workaround-ing it by using a lock, or by using the code in the accepted answer, since (as last as I understood) Guid is thread-safe.

Alessandro
  • 107
  • 1
  • 8
  • 2
    You're using `Random` incorrectly: the `Random` type is not thread-safe, but your `RandomString` function will be concurrently entered by multiple threads and so will break `Random` and corrupt its internal state: see here: https://stackoverflow.com/questions/3049467/is-c-sharp-random-number-generator-thread-safe – Dai Jun 08 '22 at 10:45
  • `Random` [is not thread safe](https://learn.microsoft.com/dotnet/api/system.random#ThreadSafety), so that code won't work correctly. Use `Random.Shared` if it's available, or any of the other approaches mentioned in the docs. Or even `Guid.NewGuid()`, for that matter, if you don't care about the exact format of the strings. – Jeroen Mostert Jun 08 '22 at 10:45
  • @JeroenMostert Or use a `lock` over it, I guess – Dai Jun 08 '22 at 10:46
  • Please provide a [mcve]. – Enigmativity Jun 08 '22 at 10:59
  • I tried your exact same code in LinqPad 7 (.NET 6, x64, both Release and Debug builds) and I could not reproduce duplicate output random strings, even when I intentionally had dozens of threads all mutating that single `Random` instance. – Dai Jun 08 '22 at 11:08
  • I just noticed you're talking about **processes** and showing us their pids, but `AsParallel()` doesn't create new processes at all, it uses the thread-pool - so I've no idea what you're doing at your end now... – Dai Jun 08 '22 at 11:11
  • @Dai NET 6's `Random` has been completely rewritten unless you specify a seed value (uses a completely different algorithm when no seed value is specified and a backwards-compatible one when one is provided). While I can't be sure (haven't tested) it wouldn't otherwise work, NET 6's Random is very different to previous ones so it might not be an accurate run (again, OP isn't mentioning his NET version) – Jcl Jun 08 '22 at 11:13
  • @Jcl thank you for letting me know (I learned something today!) - so I tried the same code in Linqpad 5 (for .NET Framework 4.8) and I couldn't repro it there either. – Dai Jun 08 '22 at 11:16
  • 1
    Would [Path.GetRandomFileName](https://learn.microsoft.com/en-us/dotnet/api/system.io.path.getrandomfilename) or [Path.GetTempFileName](https://learn.microsoft.com/en-us/dotnet/api/system.io.path.gettempfilename) help you? – Hans Kesting Jun 08 '22 at 11:21
  • 1
    @Dai as something interesting, I'm not sure the execution model of linqpad... but I just tried it on my Linqpad, and generating 300 random strings with `AsParallel()` (using a static random) works fine (no duplicates), but if I click "play" too long, no errors, but Random stops working and always returns 0 for a few seconds (then it goes back to working). Interesting nonetheless! [click "play" a few times](https://i.ibb.co/JdzpC4B/LINQPad6-ELQRc3-Abdd.png), [after some rapid clicks](https://i.ibb.co/CtFh3Xv/LINQPad6-DVJh922z-We.png)... goes back to normal if you let it "rest" for a few seconds – Jcl Jun 08 '22 at 11:32

4 Answers4

1

Your RandomString implementation is using the parameterless constructor of Random, which, if created in quick succession, uses the same seed and therefore produces the same output.

vzwick
  • 11,008
  • 5
  • 43
  • 63
  • 5
    His `Random` variable is static on his example (and link), so no new instances are being created. His problem is the non-thread-safety, not this one (which could be a very common one, for sure, but not in this case) – Jcl Jun 08 '22 at 10:54
  • 2
    @Jcl - I'm not sure. Until we see a [mcve] we can't know. – Enigmativity Jun 08 '22 at 10:59
  • 1
    FYI - this isn't true anymore starting with .net (core) - `In .NET Core, the default seed value is produced by the thread-static, pseudo-random number generator, so the previously described limitation does not apply. Different Random objects created in close succession produce different sets of random numbers in .NET Core.` - https://learn.microsoft.com/en-us/dotnet/api/system.random.-ctor?view=net-6.0#system-random-ctor= – Rand Random Jun 08 '22 at 11:33
  • 1
    @RandRandom definitely the best nickname for this thread! Also, thanks for the heads up! – vzwick Jun 08 '22 at 13:11
1

As you stated, you are running different processes with the above given code. And within this code you use a single static instance of Random. So far so good.

Unfortunately uses the Random class as initial seed the current time when created. So if you create multiple process in a tight loop, you got high chances, that two processes are using the same time as seed for the random class, which leads to producing the same numbers.

As you already shown in your code, it seems you already have access to the process id. So maybe you should generate your own seed value for the random class from the current time and the process id, maybe by calling something like this:

var seed = HashCode.Combine(DateTime.UtcNow, processId);
var random = new Random(seed);

Nevertheless, even if you can make a lower probability to hit the same value, you should check in your code if such a directory already exists and if yes, simply start a new try with the next random value.

And a last more simple trick:

If you don't care about the name of the folders, you can also use Guid.NewGuid().ToString() as your random folder name. The underlying implementation uses more sources for the seed then only time and the 128 bit range should move the probability low enough for your needs.

Oliver
  • 43,366
  • 8
  • 94
  • 151
  • `AsParallel` does not create new processes - it uses the thread-pool. .NET doesn't do POSIX-style `fork` for parallel/concurrent execution - and different processes don't share the same `static Random` instance - though you are very correct about how multiple processes of the same executable image that are started within a few hundred milliseconds of each other will likely share identical `Random` seeds and so would explain the OP's problem. – Dai Jun 08 '22 at 11:10
0

The Random instance is created many times in short period of time, using current time as the seed. Effectively, many sequences of pseudorandom numbers generated this way will likely be identical.

To make each instance of Random create a different stream of random values, you need a random seed for them - such as a GUID:

private static Random random = new Random(Guid.NewGuid().GetHashCode());

In this particular code sample, the same Random instance was used concurrently, which it is not safe for. You can modify your RandomString method to create a fresh Random instance every time it is used:

public static string RandomString(int length)
{
    Random rnd = new Random(Guid.NewGuid().GetHashCode());
    const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    return new string(Enumerable.Repeat(chars, length)
    .Select(s => s[rnd.Next(s.Length)]).ToArray());
}
Zoran Horvat
  • 10,924
  • 3
  • 31
  • 43
  • "The Random instance is created many times in short period of time, using current time as the seed." - where in the OP's code is that happening? The `static Random random` field is only set once, and `new Random()` only appears once either. – Dai Jun 08 '22 at 11:15
  • Sorry, didn't read the class correctly - the problem seems about processes running concurrently. But in this case, it seems that `Random` was used from multiple threads, which is a fault. – Zoran Horvat Jun 08 '22 at 11:17
  • 1
    FYI - this isn't true anymore starting with .net (core) - `In .NET Core, the default seed value is produced by the thread-static, pseudo-random number generator, so the previously described limitation does not apply. Different Random objects created in close succession produce different sets of random numbers in .NET Core.` - https://learn.microsoft.com/en-us/dotnet/api/system.random.-ctor?view=net-6.0#system-random-ctor= – Rand Random Jun 08 '22 at 11:37
0

Random is not threadsafe! So your code is not correct since it uses a single random object from multiple threads concurrently. So the first step should be to ensure thread safety, either by using a lock or by creating a separate random object per thread. If you go for the later solution you need to ensure that the seed for each random object is different.

JonasH
  • 28,608
  • 2
  • 10
  • 23