0

I've taken a permutation generator solution on @Artur Udod answer in this question:

Print all the possible combinations of "X" amount of characters with "X" string length (Brute Force)

I've adapted the code to return Strings instead a collection of Char andalso be able to specify whether character repetition is allowed or not on the generated permutations:

Public Shared Function PermuteCharacters(ByVal charSet As IEnumerable(Of Char),
                                         ByVal length As Integer,
                                         ByVal isRepetitionAllowed As Boolean) As IEnumerable(Of String)

    If length = 1 Then
        Return charSet.Select(Function(c As Char)
                                  Return New String(New Char() {c})
                              End Function)

    Else
        Return PermuteCharacters(charSet, length - 1, isRepetitionAllowed).
               SelectMany(Function(x As String) charSet,
                          Function(str As String, c As Char)

                              Select Case isRepetitionAllowed

                                  Case True
                                      Return str & c

                                  Case Else
                                      ' Firstly I need to check if string is empty because 
                                      ' String.Contains() will throw an exception on empty strings.
                                      ' This need to be fixed to avoid empty strings.
                                      If String.IsNullOrEmpty(str) Then
                                          Return Nothing
                                      End If

                                      If Not str.Contains(c) Then
                                          Return str & c
                                      Else
                                          Return Nothing
                                      End If

                              End Select

                          End Function)

    End If

End Function

An example usage:

Dim permutations As IEnumerable(Of String) =
    PermuteCharacters("123456789", 2, isRepetitionAllowed:=False)

The problem is that the function will create, parse, and return empty strings when I set repetition to False, with the negative point of performance that will cause in large set or permutation lengths.

I know that I could use IEnumerable.Distinct() method to remove all empty strings but one, but that will iterate again the entire big collection causing more negative performance on the code itself.

How can I design efficiently and properly the function, taking in count the performance, the overall execution time needed to create a collection of permutations?.

Its important to say that I don't see LINQ usage as a very disadvantage of performance, I will to keep using LINQ because it allows to develop a reduced code instead of the required thousands of lines for a common Loop to translate a LINQ query like that.

PS: My secondary goal is to implement the Iterator keyword on the function to improve more its performance, if someone could illustrate with a solution for this issue that also implements the Iterator capabilities will be more than awesome (and perfect).

Community
  • 1
  • 1
ElektroStudios
  • 19,105
  • 33
  • 200
  • 417
  • Can't you just add `.Where(Function(s) Not String.IsNullOrEmpty(s))` at the end? – Bjørn-Roger Kringsjå Feb 27 '15 at 17:36
  • Btw, you might find this old answer of mine interesting: http://stackoverflow.com/a/21090635/1842065 – Bjørn-Roger Kringsjå Feb 27 '15 at 17:49
  • 1
    @Bjørn-Roger Kringsjå the modification that you suggest will not avoid the 'uneedded' creation and parsing of empty strings in my `Select Case`, also, If I append a `WHERE` clausule I also need to add a comparison to return the strings (depending on `isRepetitionAllowed` like `.Where...( If(isRepetitionAllowed, Return Not String.IsNullOrEmpty(str), return str=str) )` ), and also, ¿a `WHERE` clausule will not iterate the collection again 'at the end'?. thanks anyways but a `WHERE` couldn't be in any way a solution that takes any advantage on the code's performance. – ElektroStudios Mar 01 '15 at 18:38
  • 1
    I deleted my answer. I usually don't do that. You need to improve your question. Add an example input/output. Good luck! – Bjørn-Roger Kringsjå Mar 01 '15 at 20:28
  • Assuming the existing proc *mostly* does what you want: `var = PermuteCharacters("0123456789ABCDEF", 6, False).Where(Function(x) x IsNot Nothing)` 5.7 million strings in about 7 ms. It starts to slow down for a larger string sets (like length = 8) partly because of the query work and surely because of storage issues. You cant possibly need 5.7 million perms though. – Ňɏssa Pøngjǣrdenlarp Mar 01 '15 at 20:46
  • 1
    @Plutonix That is pretty much what the first comment suggested. He said he isn't happy with that idea. @ ElektroStudios If you want to generate all combinations of a string of characters, i would go with a more traditional method. Not only will it be faster, but you will not have the issue you are having right now. At most, it would be double the amount of code you have right now. – Taekahn Mar 03 '15 at 02:09
  • 1
    The part that caught my eye in this question was *...of the required thousands of lines for a common Loop...*. I dont believe you actually mean it! A simple algorithm for taking all posible combinations whould probably take around 20 lines of code. I did write one in C and it was 25 lines(first try). My suggestion is the same with @Taekahn. Because you care about performance write your own algorithm. It will be way faster than the linq approach. – γηράσκω δ' αεί πολλά διδασκόμε Mar 04 '15 at 05:35
  • @Plutonix I dont know what computer you have but it took me 4.5 seconds! The C algorithm was around 380msec. – γηράσκω δ' αεί πολλά διδασκόμε Mar 04 '15 at 05:47

1 Answers1

1

I don't think you should start with linq since it doesn't look like you master it already. Perhaps try a simpler construct:

Private Shared Iterator Function BuildCombination(distinctChars As IEnumerable(Of Char), usedChars As Stack(Of Char), length As Integer, everyCharIsDistinct As Boolean) As IEnumerable(Of String)
' we give the method everything it needs to work
    Dim availableChars As IEnumerable(Of Char) = distinctChars
    ' what chars are available
    If everyCharIsDistinct Then
        availableChars = availableChars.Where(Function(c As Char) Not usedChars.Contains(c))
    End If
    ' if the string to return is of length 1, every available char can be returned directly
    If length = 1 Then
        For Each availableChar As Char In availableChars
            Yield New String(New Char()() = { availableChar })
        Next
    Else
        ' else we add each available char to the used list and we recurse to concat it with every possible substring
        For Each availableChar As Char In availableChars
            usedChars.Push(availableChar)
            For Each possibleSubstring As String In Program.BuildCombination(distinctChars, usedChars, length - 1, everyCharIsDistinct)
                Yield New String(New Char()() = { availableChar }) + possibleSubstring 
            Next
            usedChars.Pop()
        Next
    End If
    Return
End Function

Call it using this wrapper where we setup the lists and where we check for sane parameters:

Private Shared Sub FindCombinations(possibleChars As String, length As Integer, everyCharIsDistinct As Boolean)
    If possibleChars.Length = 0 Then
        Throw New InvalidOperationException()
    End If
    If everyCharIsDistinct AndAlso possibleChars.Length < length Then
        Throw New InvalidOperationException()
    End If
    Dim distinctChars As IEnumerable(Of Char) = possibleChars.Distinct(Of Char)()
    Dim listOfUsedChars As Stack(Of Char) = New Stack(Of Char)()
    For Each s As String In Program.BuildCombination(distinctChars, listOfUsedChars, length, everyCharIsDistinct).ToList(Of String)()
        Console.WriteLine(s)
    Next
End Sub
samy
  • 14,832
  • 2
  • 54
  • 82