-3

There's a piece of code I've been fiddling with to see if I can speed it up because it definitely shouldn't take 3.8 to 4.4 seconds to process about 1400 UserPrincipals.

I've used stopwatches to time different pieces of the function when I noticed that the pieces don't add up to the timer measuring the entire function. The gap is relatively dramatic and I can't figure out where it's coming from. I'd appreciate any insight.

The SamAccountName measurement is because I found some info that it can potentially be slow because of how the getter of the property is implemented, for anyone wondering.

Function code:

private static ConcurrentDictionary<string, User> ProcessUsers(List<UserPrincipal> users)
    {
        ConcurrentDictionary<string, User> toReturn = new ConcurrentDictionary<string, User>(4, 2000);
        Regex regex = new Regex(@"^[a-zA-Z]{3}(\d{4})$");         

        Parallel.ForEach(users, user =>
        {
            Stopwatch timerSamAccountName = new Stopwatch();
            Stopwatch timerIfBlock = new Stopwatch();
            Stopwatch timerTotal = new Stopwatch();

            timerTotal.Start();

            timerSamAccountName.Start();
            string guid = user.SamAccountName;
            timerSamAccountName.Stop();
            Console.WriteLine($"SamAccountName done in {timerSamAccountName.Elapsed} for {user.Name}");

            timerIfBlock.Start();
            if (regex.IsMatch(guid))
            {
                User newUser = new User(user.Name, guid.ToUpper(), user.EmailAddress, user); 
                toReturn.TryAdd(newUser.ID, newUser);
            }
            timerIfBlock.Stop();
            Console.WriteLine($"IF block done in {timerIfBlock.Elapsed} for {user.Name}");
            
            timerTotal.Stop();
            Console.WriteLine($"User processed in {timerTotal.Elapsed} for {user.Name}");
        });
        return toReturn;
    }

User class:

internal class User
{
    public string name { get; set; }
    public string ID { get; set; }
    public string email { get; set; }
    public UserPrincipal? activeDirectoryHandle { get; set; }

    public User(string inName = "None", string inID = "None", string inEmail = "None", UserPrincipal? adHandle = null)
    {
        name = inName;
        ID = inID;
        email = inEmail;
        activeDirectoryHandle = adHandle;
    }
}

Example of the discrepancy:

sample

Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563
mvi2110
  • 31
  • 5
  • 1
    I'm not sure what discrepancy you are looking at here? – DavidG Nov 04 '22 at 15:30
  • 1
    Console.WriteLine can be slow, try moving all the Console.WriteLines to the end of the function – berkeleybross Nov 04 '22 at 15:31
  • @DavidG First and second timers should be adding up to the third timer. – mvi2110 Nov 04 '22 at 15:31
  • https://stackoverflow.com/questions/394020/how-accurate-is-system-diagnostics-stopwatch : for high accuracy use Benchmark.net – Mitch Wheat Nov 04 '22 at 15:32
  • 7
    But you do work in the middle of writing to console, why would you expect that to take zero time? – DavidG Nov 04 '22 at 15:32
  • You don't need 3 Stopwatches, because you can query the `Elapsed` property while the Stopwatch is running. Also, store the results in `TimeSpan` variables and write to the Console once all the measurements have been completed. – Olivier Jacot-Descombes Nov 04 '22 at 15:50
  • @OlivierJacot-Descombes To use a single stopwatch to get the time of each section you'd need to store the elapsed time at the start, the end, and then compute the difference. Thant's not really any less work than just starting and stopping a stopwatch. – Servy Nov 04 '22 at 16:07
  • But it is more precise, since instantiating, starting and stopping the watches takes some time. Just writing `t2 - t1` is not a big effort. The minus operator is overloaded in `TimeSpan`. – Olivier Jacot-Descombes Nov 04 '22 at 16:20
  • 1
    @OlivierJacot-Descombes Starting and then stopping a stopwatch is basically exactly as much work as getting the current time twice and getting the difference. That's pretty much exactly what the stopwatch is doing internally, it's just abstracting the mechanism away for you. So no, you're not saving work, you're doing exactly the same thing, but without encapsulating the logic for it. – Servy Nov 04 '22 at 16:38
  • As a side note, stopwatches are not timers. A timer is a component that raises an event (or invokes a callback) at a specific time or after a specific interval. A stopwatch doesn't raise any events. It just waits passively for you to stop it, and ask it how much time passed between start and stop. – Theodor Zoulias Nov 05 '22 at 10:27

1 Answers1

1

The timers don't add up because you're doing work in places where some of the timers (timerSamAccountName and timerIfBlock) aren't running. Additionally, the work you're doing outside the timers (string interpolation and writing to the console) are the longest-running operations in the whole example. You've pretty much measured how much slower these operations are, in fact.

madmonk46
  • 421
  • 3
  • 10
  • String interpolation is definitely not slow, and is almost certainly dwarfed bt simply writing to the console. Either way, the question is really a no-repro so should be closed, not answered. – DavidG Nov 04 '22 at 15:49
  • 1
    @DavidG It's all relative. The string concatenation is being to compared to operations such as *assigning a single variable*. Compared to that, constructing a new string is orders of magnitude slower. Yes, the IO is going to be even more orders of magnitude slower still, but this is why saying things like "X is slow" or X is fast" don't really mean much in a vacuum. – Servy Nov 04 '22 at 15:53