3

Let's say i have a string like

string text = "hello dear";

Then I want to determine the longest repetition of coherented characters - in this case it would be ll. If there are more than one with the same count, take any.

I have tried to solve this with linq

char rchar = text.GroupBy(y => y).OrderByDescending(y => y.Count()).Select(x => x).First().Key;
int rcount = text.Where(x => x == rchar).Count();
string routput = new string(rchar, rcount);

But this returns ee. Am I on the right track?

Byyo
  • 2,163
  • 4
  • 21
  • 35
  • 3
    possible duplicate of [What is the most efficient way to detect if a string contains a number of consecutive duplicate characters in C#?](http://stackoverflow.com/questions/2671215/what-is-the-most-efficient-way-to-detect-if-a-string-contains-a-number-of-consec) – Kamil Budziewski Sep 07 '15 at 06:27
  • 1
    The problem with `GroupBy` is that you're counting the occurrences of each character _in the whole string_, whether they are contiguous or not. Since there's two `e` and they are the first character to appear twice in your text, `ee` is returned. – Pierre-Luc Pineault Sep 07 '15 at 06:32
  • @wudzik - all those answers are outdated or require a sequenceLength / threshold - which is not given. probably the question is duplicate but your duplicate suggestion doesn't really provide a suitable answer – fubo Sep 07 '15 at 06:36
  • @Pierre-LucPineault and is there a easy way to solve this? – Byyo Sep 07 '15 at 06:40

4 Answers4

4

If you prefer doing things with LINQ, you can do it fairly simply with a LINQ extension:

var text = "hello dear";
var result = string.Join("",
    text
    .GroupAdjacentBy((l, r) => (l == r)) /* Create groups where the current character is 
                                         Is the same as the previous character */
    .OrderByDescending(g => g.Count()) //Order by the group lengths
    .First()                           //Take the first group (which is the longest run)
);

With this extension (taken from Use LINQ to group a sequence of numbers with no gaps):

public static class LinqExtensions 
{
    public static IEnumerable<IEnumerable<T>> GroupAdjacentBy<T>(this IEnumerable<T> source, Func<T, T, bool> predicate)
    {
        using (var e = source.GetEnumerator())
        {
            if (e.MoveNext())
            {
                var list = new List<T> { e.Current };
                var pred = e.Current;
                while (e.MoveNext())
                {
                    if (predicate(pred, e.Current))
                    {
                        list.Add(e.Current);
                    }
                    else
                    {
                        yield return list;
                        list = new List<T> { e.Current };
                    }
                    pred = e.Current;
                }
                yield return list;
            }
        }
    }
}

Might be a bit overkill - but I've found GroupAdjacentBy quite useful in other situations as well.

Rob
  • 26,989
  • 16
  • 82
  • 98
4

Although a regex or custom Linq extension is fine, if you don't mind "doing it the old way" you can achieve your result with two temporary variables and a classic foreach loop.

The logic is pretty simple, and runs in O(n). Loop through your string and compare the current character with the previous one.

If it's the same, increase your count. If it's different, reset it to 1.

Then, if your count is greater than the previous recorded max count, overwrite the rchar with your current character.

string text = "hello dear";

char rchar = text[0];
int rcount = 1;

int currentCount = 0;
char previousChar = char.MinValue;
foreach (char character in text)
{
    if (character != previousChar)
    {
        currentCount = 1;
    }
    else
    {
        currentCount++;
    }

    if (currentCount >= rcount)
    {
        rchar = character;
        rcount = currentCount;
    }

    previousChar = character;
}

string routput = new string(rchar, rcount);

It's indeed verbose, but gets the job done.

Pierre-Luc Pineault
  • 8,993
  • 6
  • 40
  • 55
4

Another solution using LINQ:

string text = "hello dear";
string longestRun = new string(text.Select((c, index) => text.Substring(index).TakeWhile(e => e == c))
                                   .OrderByDescending(e => e.Count())
                                   .First().ToArray());

Console.WriteLine(longestRun); // ll

It selects a sequence of substrings starting with the same repeating character, and creates the result string with the longest of them.

w.b
  • 11,026
  • 5
  • 30
  • 49
2

RegEx would be a option

string text = "hello dear";
string Result = string.IsNullOrEmpty(text) ? string.Empty : Regex.Matches(text, @"(.)\1*", RegexOptions.None).Cast<Match>().OrderByDescending(x => x.Length).First().Value;
fubo
  • 44,811
  • 17
  • 103
  • 137