I wrote two tests to compare the performance of two different ways of finding a number in a string.

This is my code:

    [TestMethod]
    public void TestMethod1()
    {
        string text = "I want to find the number (30)";
        var startNumber = text.IndexOf('(');
        var trimmed = text.Trim(')');
        var number = trimmed.Substring(startNumber).Trim('(');

        Assert.AreEqual("30", number);
    }

    [TestMethod]
    public void TestMethod2()
    {
        string text = "I want to find the number (30)";
        var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
        var joined = string.Join("", lambdaNumber);

        Assert.AreEqual("30", joined);
    }

According to Test Explorer, TestMethod2 (the one with the lambda expression) is faster than TestMethod1:

TestMethod1 = 2 ms, TestMethod2 = < 1 ms

If I add a `Stopwatch` to each test instead, TestMethod1 is by far the faster.

How can I properly test the performance of this behaviour?

EDIT:

I appreciate that the two methods do not perform the same operation, so I created the following instead:

    [TestMethod]
    public void TestMethod1()
    {
        var sw = new Stopwatch();
        sw.Start();

        var number = string.Empty;
        var counter = 0;
        while (counter < 100000)
        {
            number = string.Empty;
            string text = "I want to find the number (30)";
            foreach (var c in text.ToCharArray())
            {
                int outNumber;
                if (int.TryParse(c.ToString(), out outNumber))
                    number += c.ToString();
            }
            counter++;
        }

        sw.Stop();

        Assert.AreEqual("30", number);
    }

    [TestMethod]
    public void TestMethod2()
    {
        var sw = new Stopwatch();
        sw.Start();

        var joined = String.Empty;
        var counter = 0;
        while (counter < 100000)
        {
            string text = "I want to find the number (30)";
            var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
            joined = string.Join("", lambdaNumber);
            counter++;
        }

        sw.Stop();

        Assert.AreEqual("30", joined);
    }

According to the `Stopwatch`, the results are: TestMethod1 = 19 ms, TestMethod2 = 7 ms

Thank you for all the replies

DevJoe
  • Make a loop from 1 to 10000, execute both of them with a stopwatch around each one, and see which is faster. – mybirthname Nov 16 '16 at 12:40
  • It seems unfair to compare them like that, as the two methods are not the same: given input like "10 20 (30)", the first will return "30" but the second will return "102030". – stuartd Nov 16 '16 at 12:41
  • The first method searches for parentheses. The second method searches for digits. You're comparing apples with tomaccos. –  Nov 16 '16 at 12:43
  • First method also creates new intermediary strings (Trim, Substring) while the latter only does this at the end (with .Join()). – Measurity Nov 16 '16 at 12:45
  • The first test it runs (MS test in the IDE) always seems to have some overhead compared to the second. As others have said, your tests do not measure the same type of operations at all, and you should measure time in a loop for both, to make the impact of random external factors smaller. – C.Evenhuis Nov 16 '16 at 12:46
  • As described, I have already measured the performance with a `Stopwatch` in each method. The question might actually be whether I should rely on the `Stopwatch` value or on Test Explorer. – DevJoe Nov 16 '16 at 12:49
  • Performance includes both CPU and RAM metrics. GC behavior is another important measurement, and what makes the string splitting method dozens of times slower, even though that doesn't show when you only test a single call that doesn't have a chance to collect temporary strings – Panagiotis Kanavos Nov 16 '16 at 12:50
  • @Sasquatch you can't depend on either, because the entire test is wrong. Use a tool like BenchmarkDotNet to repeat the same test multiple times and collect important metrics like duration, RAM, allocations, GC cost. Personally, I'd use a Regex. It's both simpler and faster – Panagiotis Kanavos Nov 16 '16 at 12:51
  • @PanagiotisKanavos Thank you very much! I will look into that – DevJoe Nov 16 '16 at 12:58
  • To get the time a block of code takes to be executed you should ignore the time spent initializing, which TestExplorer is not doing. However I would rely on inserting a block of code into a loop and getting an average of the StopWatch results *without using tests*. Make sure to check the following question and follow the links in the comments - I found the professional tools (as @PanagiotisKanavos mentioned) and Rich Turner's comment very helpful. --> http://stackoverflow.com/questions/28334337/is-there-a-way-to-calculate-elapsed-time-for-test-methods-ignoring-time-spent-in – Daniel Tsvetkov Nov 16 '16 at 13:03
  • We do not know what overhead is produced by the `[TestMethod]` attribute. It may be that the first run gets some general overhead added. It might be interesting to create copies of the two tests, named `TestMethod03`, `TestMethod04` and so on up to `TestMethod50`, where the odd numbered tests are copies of `TestMethod1` and all the evens are copies of `TestMethod2`. However, the real issue is that the two methods do very different things; please read carefully the comments already made by other people. – AdrianHHH Nov 16 '16 at 13:13
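
To make the point from the comments concrete, here is a minimal sketch that runs both approaches on the same inputs. The class and method names (`ExtractionComparison`, `BetweenParentheses`, `AllDigits`) are made up for illustration, and `new string(...)` stands in for the `string.Join` call in the question.

```csharp
using System;
using System.Linq;

public static class ExtractionComparison
{
    // Approach from the first test: cut at the first '(' and trim the parentheses.
    public static string BetweenParentheses(string text)
    {
        var start = text.IndexOf('(');
        var trimmed = text.Trim(')');
        return trimmed.Substring(start).Trim('(');
    }

    // Approach from the second test: keep every character Char.IsNumber accepts.
    public static string AllDigits(string text)
    {
        return new string(text.Where(c => char.IsNumber(c)).ToArray());
    }

    public static void Main()
    {
        foreach (var input in new[] { "I want to find the number (30)", "10 20 (30)" })
        {
            Console.WriteLine($"{input} -> {BetweenParentheses(input)} / {AllDigits(input)}");
        }
    }
}
```

Both methods agree on the original sentence, but for "10 20 (30)" the first returns "30" and the second returns "102030", so any timing difference is between two different operations.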

1 Answer

Since I agree with most of the comments, I thought it might help to put up a test that does not use unit tests. If you work with LINQ, consider LINQPad (the standard edition is free) for running tests like this or other small code blocks. Here are the tests, expanded to include Regex as well, with each one run in a loop of 100000 iterations.

    void Main()
    {
        string text = "I want to find the number (30)";

        Stopwatch sw = Stopwatch.StartNew();

        for (int i = 0; i < 100000; i++)
        {
            TestMethod1();
        }

        sw.Elapsed.TotalMilliseconds.Dump("Substring no parameter");
        sw = Stopwatch.StartNew();

        for (int i = 0; i < 100000; i++)
        {
            TestMethod1(text);
        }

        sw.Elapsed.TotalMilliseconds.Dump("Substring parameter");
        sw = Stopwatch.StartNew();

        for (int i = 0; i < 100000; i++)
        {
            TestMethod2();
        }

        sw.Elapsed.TotalMilliseconds.Dump("LINQ no parameter");
        sw = Stopwatch.StartNew();

        for (int i = 0; i < 100000; i++)
        {
            TestMethod2(text);
        }

        sw.Elapsed.TotalMilliseconds.Dump("LINQ parameter");
        sw = Stopwatch.StartNew();

        for (int i = 0; i < 100000; i++)
        {
            TestMethod3(text);
        }

        sw.Elapsed.TotalMilliseconds.Dump("Regex In");
        sw = Stopwatch.StartNew();

        for (int i = 0; i < 100000; i++)
        {
            TestMethod4(text);
        }

        sw.Elapsed.TotalMilliseconds.Dump("Regex Out");
    }

    // Define other methods and classes here
    public void TestMethod1()
    {
        string text = "I want to find the number (30)";
        var startNumber = text.IndexOf('(');
        var trimmed = text.Trim(')');
        var number = trimmed.Substring(startNumber).Trim('(');
    }

    public void TestMethod1(string text)
    {
        var startNumber = text.IndexOf('(');
        var trimmed = text.Trim(')');
        var number = trimmed.Substring(startNumber).Trim('(');
    }

    public void TestMethod2()
    {
        string text = "I want to find the number (30)";
        var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
        var joined = string.Join("", lambdaNumber);
    }

    public void TestMethod2(string text)
    {
        var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
        var joined = string.Join("", lambdaNumber);
    }

    public void TestMethod3(string text)
    {
        var regex = new Regex(@"(\d+)");
        var match = regex.Match(text);
        var joined = match.Captures[0].Value;
    }

    public Regex regex = new Regex(@"(\d+)");

    public void TestMethod4(string text)
    {
        var match = regex.Match(text);
        var joined = match.Captures[0].Value;
    }

And the results (elapsed milliseconds for 100000 calls each):

    Substring no parameter    11.3526
    Substring parameter       10.2901
    LINQ no parameter         60.2359
    LINQ parameter            56.5218
    Regex In                 301.1179
    Regex Out                 89.8345
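
"Regex In" builds a new `Regex` on every call while "Regex Out" reuses one instance, which probably accounts for much of the gap between the two. As a rough sketch of the reuse idea, the version below keeps a single static instance and also anchors the pattern to the parentheses. The pattern `\((\d+)\)`, the `RegexOptions.Compiled` flag, and the names `ParenthesizedNumber` and `Find` are my assumptions, not part of the measurements above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParenthesizedNumber
{
    // Built once and reused by every call. RegexOptions.Compiled costs more
    // up front in exchange for faster matching afterwards.
    private static readonly Regex Pattern =
        new Regex(@"\((\d+)\)", RegexOptions.Compiled);

    // Returns the digits inside the first "(digits)" group, or null if there is none.
    public static string Find(string text)
    {
        var match = Pattern.Match(text);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(Find("I want to find the number (30)"));
    }
}
```

Unlike `(\d+)`, which matches the first run of digits anywhere in the string, this pattern only accepts digits wrapped in parentheses, so it behaves like the substring version on inputs such as "10 20 (30)". It has not been timed here.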

Conclusion? We are still comparing apples to oranges to diamonds. And regex does not seem to be as fast as some have suggested. Professional tools are the way to go.
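
Short of a dedicated tool such as the BenchmarkDotNet mentioned in the comments, a hand-rolled measurement should at least run the code a few times before timing it (so JIT compilation is not counted) and take several samples. The sketch below is one way to do that; `RoughTimer` and `MedianMilliseconds` are hypothetical names, and the warm-up count, iteration count, and sample count are arbitrary choices.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class RoughTimer
{
    // Runs `action` untimed first, then times `samples` batches of
    // `iterations` calls each and returns the median batch time in ms.
    public static double MedianMilliseconds(Action action, int iterations = 100000, int samples = 5)
    {
        for (int i = 0; i < 1000; i++) action();   // warm-up, not measured

        var timings = new double[samples];
        for (int s = 0; s < samples; s++)
        {
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++) action();
            sw.Stop();
            timings[s] = sw.Elapsed.TotalMilliseconds;
        }

        Array.Sort(timings);
        return timings[samples / 2];
    }

    public static void Main()
    {
        const string text = "I want to find the number (30)";
        string sink = null; // store the result so the work is actually used

        var substring = MedianMilliseconds(() =>
            sink = text.Trim(')').Substring(text.IndexOf('(')).Trim('('));
        var linq = MedianMilliseconds(() =>
            sink = new string(text.Where(c => char.IsNumber(c)).ToArray()));

        Console.WriteLine($"Substring: {substring:F2} ms, LINQ: {linq:F2} ms (last result: {sink})");
    }
}
```

This still says nothing about allocations or GC cost, which is where a real benchmarking tool earns its keep.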

CrnaStena