-3

I have to make a function that receives a row of numbers and returns the number which repeats twice. The row must always fit those rules:

  • numbers must be separated by comma row contains only two identical
  • numbers, other numbers are unique.

Row examples: "1,2,3,4,5,6,7,7", "10,10,12,15", "1,1,7,8,15" etc.

Here is my code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string q1 = "6,8,7,3,2,5,6";
            Console.WriteLine(my_funct(q1));
        }
        static int my_funct(string q)
        {
            string[] numbers = q.Split(',');
            List<string> numbers2 = new List<string>();
            int border = numbers.Count();
            for (int i = 0; i < border; i++)
            {
                if (numbers2.Contains(numbers[i]))
                {
                    return Convert.ToInt32(numbers[i]);
                }
                else numbers2.Add(numbers[i]);
            }
            return -1;
        }
    }
}

I want this function to execute as fast as possible to get the maximal performance? What should I change in my code, what other methods should I use to manage that?

Kalvis
  • 595
  • 2
  • 7
  • 13
  • 1
    Your code is hardly readable. Please paste the code into your question. – BlueM Mar 17 '14 at 15:17
  • None of this shows us how good or bad this code is performing, just time spent within its scope on certain calls; i.e. a highlight of the 'worst' parts: but even the longest part could be very quick, we don't know! do you actually have experience of it performing badly, with delays and whatnot? Also, if a method _needs_ to use `Contains`, say, and that's shown to be the bottleneck, it doesn't mean you mustn't use the method, that's just how fast it can be using it etc. – Grant Thomas Mar 17 '14 at 15:22
  • What horrible naming.. – Blindy Mar 17 '14 at 15:24
  • It's latvian, but does it really make any sense? – Kalvis Mar 17 '14 at 15:26
  • Your code as posted doesn't compile, and if the fact that it's Latvian mattered, we'd point you to a Latvian speaking forum. Wasn't even talking about that, what possible reason for naming your function with a trailing `_1` could you have? – Blindy Mar 17 '14 at 15:29
  • Just added full code, it should execute now. – Kalvis Mar 17 '14 at 15:42

3 Answers3

1

10% of your time is spent in a one-shot Convert.ToInt32, I would say you're wasting your time on this function.

As a side note, if we didn't know exactly what Convert.ToInt32 did to get a feel for those numbers, your entire image would be completely worthless. Execution counts, absolute time (wall clock time) spent on line etc, those are real metrics.

Edit: For fun, here's a one liner that replaces your function, actually compiles, and isn't a mess of repeated string comparisons and temporary lists:

    var s = "9;6,8,7,3,2,5,6,1,4";
    var res = s.Substring(s.IndexOf(';') + 1).Split(',')      // tokenize the list
        .Take(int.Parse(s.Substring(0, s.IndexOf(';'))))      // limit by the first number (it crashes in your case btw if there's no duplicate)
        .Select(i => int.Parse(i))                            // transform to numbers
        .GroupBy(i => i).Where(grp => grp.Count() > 1).Select(grp => grp.Key) // only takes into account repeated numbers
        .DefaultIfEmpty(-1).First();                          // -1 if nothing, otherwise first number
    Console.WriteLine(res);       // 6 in your case
Blindy
  • 65,249
  • 10
  • 91
  • 131
  • I know that `Int32.Parse` is faster than `Convert.ToInt32`... I'm quite new in using "performance analysis".. it would be grate if you could tell me where to find those 'real' parameters. – Kalvis Mar 17 '14 at 15:47
  • I'm using Microsoft Visual Studio 2012 Professional if that makes any sense. – Kalvis Mar 17 '14 at 15:55
1

Is it necessary to change the code until all lines are colored dull-red to get the maximum performance?

No. Think about it. You could make all lines the same colour by having them all take the same time to execute. One approach to this would be to make the very light red coloured lines slower, which is probably not what you want to do.

The degree of redness is proportional to the fraction of the total time that the line takes. In most codes there will be a variation from white (ie no red) to dark red; that is not a symptom of there being anything wrong. Red does not indicate badness it only indicates relative time.

Of course, having identified that one line is the slowest, the most red, it may be worth your while investigating ways to make it faster. Be prepared for that to result in another line becoming the most red when you next profile the code.

High Performance Mark
  • 77,191
  • 7
  • 105
  • 161
  • But if I'm using ,for example, `Int32.Parse` instead of `Convert.ToInt32`, it allows my function to perform faster? – Kalvis Mar 17 '14 at 15:52
1

Your code is as fast as you can make it when you cannot find a way to make it faster. It has nothing to do with colors.

All a profiler can do is give you a general idea of what code accounts for what fraction of time. If a piece of code accounts for 50% of time, then if you could find a way to eliminate it, you would save 50%, or whatever the percent happens to be. If it is absolutely necessary to be doing that code, then it just is.

Don't just look at "self time". Especially as programs get larger, their call stacks get deeper, and it is just as easy to waste time by doing calls you could get rid of. If so, then those calls will be on the stack during the wasted time, so it is important to look at "inclusive time".

Be careful of so-called "CPU-profilers". They make the assumption that if you are doing any I/O it is simply unavoidable, so they ignore it. In fact, you may be doing I/O that you don't know about, and if you did you would know it could be avoided. This can be a huge performance drain.

That's why I use this method. It automatically handles inclusive and I/O time, at line-level resolution. It does not measure time precisely. Rather it measures time coarsely, but locates problems precisely.

Community
  • 1
  • 1
Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135