-2

Need to find the count of each consecutive characters in a row. Ex: aaaabbccaa output: 4a2b2c2a

Character may repeat but need to count only consecutive ones. I also need to maintain original sequence.

I tried following but it groups all characters so was not useful.

str.GroupBy(c => c).Select(g => new { g.Key, Count = g.Count() }).ToList().ForEach(x => str+= x.Count + "" + x.Key)
  • 2
    Does this answer your question? [Run-length encoding of a given string](https://stackoverflow.com/questions/27573521/run-length-encoding-of-a-given-string) – Robert McKee Feb 14 '20 at 18:02
  • @RobertMcKee Solution from link also gives output as 5a3b2c but I need 4a2b2c2a – Rajeeb Tamrakar Feb 14 '20 at 18:24
  • Shouldn't it be "4a2b2c2a" for "aaaabbccaa" string? – Dennis Feb 14 '20 at 18:24
  • Thanks @Dennis. Corrected my question. – Rajeeb Tamrakar Feb 14 '20 at 18:26
  • You ask for a solution using LINQ, and it is possible to create one, but the solution that uses LINQ is considerably harder to understand than a straightforward solution using enumerators. Can you say more about why you've decided ahead of time that LINQ is the correct solution to your problem? – Eric Lippert Feb 14 '20 at 18:35
  • @EricLippert Actually any efficient solution would work. I just didn't want to put loops. – Rajeeb Tamrakar Feb 14 '20 at 18:39
  • 3
    Now you are saying that you do not want to use loops; can you say why it is that you are resisting using simple, straightforward programming techniques to solve your problem? You seem to want to make this problem harder than it is for some reason, and it is not clear to me why. – Eric Lippert Feb 14 '20 at 18:40
  • As Eric pointed out, I wouldn't do this as LINQ. It's trivial as an iterator/Enumerable. – Robert McKee Feb 14 '20 at 18:45
  • @EricLippert I get your point and also agree with you. I did similar to what Dennis did in answer. I am just searching for any other efficient code as with loop it was taking long time for long string input. – Rajeeb Tamrakar Feb 14 '20 at 20:16
  • FYI, loops are always faster than LINQ as they avoid the overhead of lambda method calls. In general, you can expect that the longer your code is (and the simpler), the faster it will run. LINQ can express algorithms very concisely, so it can be cognitively efficient, but not necessarily CPU time efficient. How long a string are you talking that you think this matters? – NetMage Feb 14 '20 at 20:27
  • 1
    If your question is "how do I diagnose a performance problem?" then **ask that question**. – Eric Lippert Feb 14 '20 at 20:35
  • I would be interesting to know if the [regex solution](https://stackoverflow.com/a/60232866/18192) ends up actually being faster than the for loop solution rejected by OP. I generally take it for granted that a Regex is going to be slower than hand-parsing. – Brian Feb 14 '20 at 21:39
  • 1
    @Brian: There are limited cases where a regex can be winning in the amortized case because some regex engines let you compile a regex you are going to use frequently, and it then generates optimized code for matching the expression. But you are correct that the vast majority of time, one-off regex is going to be considerably slower. That said, fast enough is by definition fast enough; we should judge solutions by more metrics than just raw speed. – Eric Lippert Feb 14 '20 at 21:49
  • 1
    A quick benchmark shows that for a 10,000 char string, the regex method takes 1.09 times longer than my (quite generic) LINQ extension method (`GroupByRuns`), and is 12 times slower than a straight forward `for` loop implementation, though the regex can get a little closer with some minor optimization. – NetMage Feb 14 '20 at 23:46

7 Answers7

3

Here is a LINQ solution:

var input = "aaaabbccaa";
var result = string.IsNullOrEmpty(input) ? "" : string.Join("",input.Skip(1)
        .Aggregate((t:input[0].ToString(),o:Enumerable.Empty<string>()),
           (a,c)=>a.t[0]==c ? (a.t+c,a.o) : (c.ToString(),a.o.Append(a.t)),
           a=>a.o.Append(a.t).Select(p => $"{p.Length}{p[0]}")));

Here is the iterator solution:

var result = RleString("aaaabbccaa");

private static IEnumerable<(char chr, int count)> Rle(string s)
{
    if (string.IsNullOrEmpty(s)) yield break;

    var lastchar = s.First(); // or s[0]
    var count = 1;
    foreach (char letter in s.Skip(1))
    {
        if (letter != lastchar)
        {
            yield return (lastchar, count);
            lastchar = letter;
            count = 0;
        }
        count++;
    }
    if (count > 0)
        yield return (lastchar, count);
}
private static string RleString(string s)
{
    return String.Join("",Rle(s).Select(z=>$"{z.count}{z.chr}"));
}
Robert McKee
  • 21,305
  • 1
  • 43
  • 57
  • +1 for using `Aggregate` and +1 for using 4 argument version, but output should have count before letter, and using `o` for a tuple element name is just a bad idea. Or even single letters when they aren't clear. – NetMage Feb 14 '20 at 20:36
  • Yes, the LINQ version could be cleaned up considerably. I just tossed it together, but I wouldn't recommend using it. Updated to fix the output to have count then char. – Robert McKee Feb 14 '20 at 20:59
3

Regular expression to the rescue ?

var myString = "aaaabbccaa";

var pattern = @"(\w)\1*";
var regExp = new Regex(pattern);
var matches = regExp.Matches(myString);

var tab = matches.Select(x => String.Format("{0}{1}", x.Value.First(), x.Value.Length));
var result = String.Join("", tab);
user3041160
  • 684
  • 3
  • 5
  • Since `MatchCollection` does not implement `IEnumerable`, wouldn't you need to replace `matches.Select` with `matches.Cast().Select` ? – Brian Feb 14 '20 at 21:26
  • 2
    Also, for those wondering how this works: `(\w)` captures a single word character, `\1` represents a capturing group 1, `*` means 0 or more matches. Hence, `\1*` means, "0 or more copies of the character which matched `(\w)`." So, the regex forces each match to be a string of identical characters. The rest of the code is straightforward: It takes each match and prints out the matching first character and the match's length. – Brian Feb 14 '20 at 21:31
2

Non-LINQ solution (dotnetfiddle):

using System;
using System.Text;

public class Program
{
    public static void Main()
    {
        // produces 4a2b2c2a
        Console.WriteLine(GetConsecutiveGroups("aaaabbccaa"));
    }

    private static string GetConsecutiveGroups(string input)
    {       
        var result = new StringBuilder();
        var sb = new StringBuilder();

        foreach (var c in input)
        {
            if (sb.Length == 0 || sb[sb.Length - 1] == c)
            {
                sb.Append(c);
            }
            else
            {
                result.Append($"{sb.Length}{sb[0]}");
                sb.Clear();
                sb.Append(c);
            }
        }

        if (sb.Length > 0)
        {
            result.Append($"{sb.Length}{sb[0]}");
        }

        return result.ToString();
    }
}
Dennis
  • 37,026
  • 10
  • 82
  • 150
0

This small program will do the trick, but it's not a single line nice linq statement. Just my two cents.

using System;
using System.Linq;
using System.Collections.Generic;

public class Simple {

  public static void Main() {    

var text = "aaaabbccaa"; //output: 4a3b2c2a
var lista = new List<string>();
var previousLetter = text.Substring(1,1);
var item = string.Empty;
foreach (char letter in text)
{
    if (previousLetter == letter.ToString()){
        item += letter.ToString();          
    }
    else
    {
        lista.Add(item);
        item = letter.ToString();           
    }
    previousLetter = letter.ToString();
}
lista.Add(item);    
foreach (var i in lista)
     Console.WriteLine(i.Substring(1,1) + i.Select(y => y).ToList().Count().ToString());
  }
}
0

Here is my non-LINQ version that is quite fast compared to LINQ or Regex:

    var prevChar = str[0];
    var ct = 1;
    var s = new StringBuilder();
    var len = str.Length;
    for (int j2 = 1; j2 < len; ++j2) {
        if (str[j2] == prevChar)
            ++ct;
        else {
            s.Append(ct);
            s.Append(prevChar);
            ct = 1;
            prevChar = str[j2];
        }
    }
    s.Append(ct);
    s.Append(prevChar);
    var final = s.ToString();
}

My LINQ version looks like this, but uses a couple of extension methods I already had:

var ans = str.GroupByRuns().Select(s => $"{s.Count()}{s.Key}").Join();
NetMage
  • 26,163
  • 3
  • 34
  • 55
0
var chars = "aaaabbccaa".ToCharArray();
int counter = 1;
for (var i = 0; i < chars.Count(); i++)
{
    if (i + 1 >= chars.Count() || chars[i] != chars[i + 1])
    {
        Console.Write($"{counter}{chars[i]}");
        counter = 1;
    }
    else
    {
        counter++;
    }
}
Blu3
  • 41
  • 8
-1

You could have a character var and a counter var outside your Linq scope to keep track of the previous character and the current count and then use linq foreach, but I am just as curious as the rest to why you insist on doing this. Even if you do, the Solution may not be as easy to read as an iterative version and readability and maintenance overhead is very import if anyone else is ever going to read it.

Stephan Møller
  • 1,247
  • 19
  • 39