3

I am trying to generate random data based on some criteria. I am stuck on how to start so I don't have any code.

I just want some guide on how I can achieve that. You need not provide complete code.

So, coming to the problem, lets say we have this existing data:

Total - 10
Won   -  7
Lost  -  3
Longest Winning Streak - 5
Longest Losing Streak - 2

Now I need to generate an array of random boolean values (true representing a win and false representing a loss) which fulfills the above criteria.

So, in this case the output can be any of the following:

0011011111
1111101100
1010011111
..........

Its the streak part that bothers me. If it weren't for the streak, I could have generated seven 1(s) and three 0(s) and then randomly shuffled them.

Note: I would prefer solutions in C#, VB.NET, JavaScript, or Python but any language is welcome.

Fᴀʀʜᴀɴ Aɴᴀᴍ
  • 6,131
  • 5
  • 31
  • 52
  • 2
    1. What have you tried already? 2. Any language is welcome? Bad idea, since we only solve [*special issues*](http://stackoverflow.com/help/mcve), if your question accept all the languages, then there's no *a correct answer*. – Remi Guan Jan 31 '16 at 16:09
  • 2
    one approach would be to create all the combinations of wins and loses then filter out the ones that don't match the streak criteria. – juharr Jan 31 '16 at 16:09
  • Should I then post this at Code Golf? – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 16:10
  • @KevinGuan If you would bother reading [this](http://stackoverflow.com/help/on-topic) you would see that this question isn't off-topic. And also asking about common programming problems/algorithms is allowed afaik – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 16:12
  • 1
    @KevinGuan I could certainly have made this C# only but since it is related with the fundamental data types and the .NET framework doesn't come to play, I thought it would be better to not make this language specific as it would provide a wider scope of being answered. – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 16:16
  • @FᴀʀʜᴀɴAɴᴀᴍ: So the problem is, if I posted an answer which is in Python, another one posted one in C# and they're both correct. Which one is the *correct answer* and you would accept? I didn't say your question is off-topic BTW, I meant **too broad**. – Remi Guan Jan 31 '16 at 16:19
  • Then I would request any answerer to post pseudo-code :) – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 16:20
  • @Bjørn-RogerKringsjå Do as you please as long as there's some explanation. I don't even need code. Just provide an algorithm. I don't want 12 bytes of Pyth not understanding what is going on :) – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 16:23
  • @FᴀʀʜᴀɴAɴᴀᴍ: Err...a pseudo-code answer isn't a really answer since the code *doesn't work*. – Remi Guan Jan 31 '16 at 16:23
  • Someone create a chatroom please. – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 16:23
  • create the win streak and the losing streak. Copy the win to a random start location, then the loosing; then fill in the blanks. The trick will be working out the range so that the longer streak leaves room for the shorter – Ňɏssa Pøngjǣrdenlarp Jan 31 '16 at 16:24
  • @Plutonix hmm. Good... better than criticizing my question. Any idea how I would fill in the gaps? – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 16:26
  • Thats not the hard part - start with an int array of 1s, the wins would reset to -1, losses to 0, the ones to fill in would be 1 - change them using Random.NextBoolean. then convert the array to a boolean array (off the top of my head) – Ňɏssa Pøngjǣrdenlarp Jan 31 '16 at 16:32
  • @Plutonix aha got it.. I'm kind of late in understanding things but thanks for help. 'working out the range so that the longer streak leaves room' - That's easy enough it seems. I'll now try and implement it in my program. – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 16:34
  • You dont really need a Win and Lose array, just a way to tell Wins(-1) from Losses(0) and unset(1). The win start index > 0 and < 10-win length, but may need to repick until there is room for (Loss Length) consecutive values – Ňɏssa Pøngjǣrdenlarp Jan 31 '16 at 16:39
  • @FᴀʀʜᴀɴAɴᴀᴍ If you change title and demand in python then i can post answer.. – Learner Jan 31 '16 at 16:45
  • Post it. The **first** correct answer will be accepted. Other languages will be upvoted as well if they are correct and posted within the next two days only. – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 16:46
  • @SIslam it already has a python tag – Ňɏssa Pøngjǣrdenlarp Jan 31 '16 at 16:47
  • something else to watch for is once you set the Wins, then the Losses, setting the Leftovers to NextBoolean could accidentally extend either of the streaks – Ňɏssa Pøngjǣrdenlarp Jan 31 '16 at 17:15
  • That is the problem I faced as well after trying your solution. – Fᴀʀʜᴀɴ Aɴᴀᴍ Jan 31 '16 at 17:16
  • just dont use Random.NextBoolean for leftovers - set it to the opposite of the value at the next index...unless it is the last index, then use the preceeding – Ňɏssa Pøngjǣrdenlarp Jan 31 '16 at 17:24

3 Answers3

3

I'd suggest a genetic algorithm.

    class Program
{
    static int len = 10;
    static int win = 7;
    static int lws = 5;
    static int lls = 2;

    static Random rnd = new Random();

    static void Main(string[] args)
    {
        int genSz = 15;
        var generation = new List<Chromosome>();
        Helper.Repeat(() => generation.Add(new Chromosome()), genSz);

        int gen = 1;
        while (generation.First().Fitness != 0)
        {
            //procreation
            Helper.Repeat(() => {
                int x1 = rnd.Next(genSz / 2);
                int x2 = rnd.Next(genSz);
                generation.Add(new Chromosome(generation[x1], generation[x2]));
            }, genSz);
            //selection
            generation = generation.OrderBy(x => x.Fitness).Take(genSz).ToList();
            Console.WriteLine("GENERATION " + gen++);
            foreach (var x in generation)
            {
                Console.WriteLine(x);
            }
            Console.ReadLine();
        }
        Console.ReadLine();
    }

    class Chromosome
    {
        bool[] genes = new bool[len];

        public Chromosome() { }

        public Chromosome(Chromosome p1, Chromosome p2)
        {
            //crossingover
            rnd.Shuffle(ref p1, ref p2);    //may reorder parents or not
            var x = rnd.Next(len);
            Array.Copy(p1.genes, 0, genes, 0, x);
            Array.Copy(p2.genes, x, genes, x, len - x);
            //mutation
            if (rnd.Flip())
            {
                x = rnd.Next(len);
                genes[x] = !genes[x];
            }
        }
        public int Fitness
        {
            get
            {
                int w = genes.Count(g => g);
                int l = len - w;
                int ws = genes.LongestStreak(g => g);
                int ls = genes.LongestStreak(g => !g);
                return Math.Abs(w - win) + Math.Abs(lws - ws) + Math.Abs(lls - ls);
            }
        }
        public override string ToString()
        {
            return "[" + new string(genes.Select(g => g ? '*' : '.').ToArray()) + "] " + Fitness.ToString();
        }
    }
}

public static class Helper
{
    public static bool Flip(this Random rnd) => rnd.Next(2) == 0;
    public static void Shuffle<T>(this Random rnd, ref T a, ref T b, bool allowNoChange = true)
    {
        if (allowNoChange && rnd.Flip()) return; //no reordering
        T tmp = a; a = b; b = tmp;
    }

    public static int LongestStreak<T>(this IEnumerable<T> sequence, Predicate<T> selector)
    {
        int current = 0;
        int longest = 0;
        foreach (T x in sequence)
        {
            if (selector(x))
            {
                current++;
                if (current > longest) longest = current;
            }
            else
            {
                current = 0;
            }
        }
        return longest;
    }

    public static void Repeat(this Action action, int N)
    {
        for (int n = 0; n < N; n++) action();
    }
}

enter image description here

Second variant - brute force. Can be used if the sequence is short. Also you can get all possible variants with it.

    class Program
{
    static void Main(string[] args)
    {
        var res = new[] { true, false }.Words(10).Where(w => {
            return
                w.Count(g => g) == 7 &&
                w.LongestStreak(g => g) == 5 &&
                w.LongestStreak(g => !g) == 2;
        });
        foreach (var v in res)
        {
            foreach (var b in v)
            {
                Console.Write(b ? "*" : ".");
            }
            Console.WriteLine();
        }
        Console.WriteLine(res.Count());
        Console.ReadLine();
    }
}

public static class Helper
{
    public static IEnumerable<IEnumerable<T>> Words<T>(this IEnumerable<T> alphabet, int len)
    {
        foreach (var l in alphabet)
        {
            if (len == 1)
            {
                yield return l.Yield();
            }
            else
            {
                foreach (var w in alphabet.Words(len - 1))
                {
                    yield return w.Prepend(l);
                }
            }
        }
    }

    public static IEnumerable<T> Yield<T>(this T item)
    {
        yield return item;
    }

    static IEnumerable<T> Prepend<T>(this IEnumerable<T> rest, T first)
    {
        yield return first;
        foreach (var item in rest)
            yield return item;
    }

    public static int LongestStreak<T>(this IEnumerable<T> sequence, Predicate<T> selector)
    {
        int current = 0;
        int longest = 0;
        foreach (T x in sequence)
        {
            if (selector(x))
            {
                current++;
                if (current > longest) longest = current;
            }
            else
            {
                current = 0;
            }
        }
        return longest;
    }
}

enter image description here

Mike Tsayper
  • 1,686
  • 1
  • 17
  • 25
1

My suggestion would be to use an algorithm to select k bits (your won number) from a length-n (your total number) string. Here, I use the kbits(n, k) function written by @foglebird. You can then filter out the undesired permutations using a list comprehension.

import itertools

def kbits(n, k):
    result = []
    for bits in itertools.combinations(range(n), k):
        s = ['0'] * n
        for bit in bits:
            s[bit] = '1'
        result.append(''.join(s))
    return result

total = 10
won = 7
lost = 3
max_win = 5
max_lose = 2

answer = [x for x in kbits(total, won) if '1'*(max_win+1) not in x and '0'*(max_lose+1) not in x]
Community
  • 1
  • 1
gtlambert
  • 11,711
  • 2
  • 30
  • 48
1

I had an answer posted, then noticed I was missed some key requirements. I added and changed some stuff to address those missing elements.

The core method fails most of the time, but it does so quickly enough that you can do it in a loop until you get a good answer. Depending on the actual values, in cases where there are very few legal results, it seems like you need to luck out.

The steps used:

  • Pick a random spot for the longest streak (Win in the example)
  • Bracket the streak with losses to prevent extending it when setting leftovers
  • Find the indices with enough consecutive slots to hold the loss streak
  • Pick a random one and set the loss streak (returns if there are none)
  • Set all the Leftovers as Not the value at n-1 to avoid extending or creating a new streak

So, it becomes hit or miss whether then WinCount and LossCount are correct. That seems easier to stumble upon than streaks of the right size. A wrapper method tests the result to reject and rerun. With the given values, it usually find a winner in the first 10 or so times.


The core method to construct a string representation, and a helper:

' ToDo change to return Bool() = string is easier to read
Private Function FarhamStreaks(winStrk As Int32, loseStrk As Int32, total As Int32) As String

    ' -1 == not set
    Dim result = Enumerable.Repeat(-1, total).ToArray

    ' set longest streak first
    Dim wNDX = RNG.Next(0, total + 1 - winStrk)
    For n As Int32 = 0 To winStrk - 1
        result(wNDX + n) = 1
    Next
    ' bracket with losers so the w streak cant extend
    If wNDX > 0 Then result(wNDX - 1) = 0
    If wNDX + winStrk < result.Length - 1 Then result(wNDX + winStrk) = 0

    ' look for eligible consecutive starting slots
    ' might be none
    Dim lossNdx As New List(Of Int32)
    For n As Int32 = 0 To result.Count - 1
        Dim count = CountConsecutiveLooserSlotsFrom(n, result)

        If (n + 1) < result.Count AndAlso count >= loseStrk Then
            lossNdx.Add(n)
        End If
    Next

    If lossNdx.Count = 0 Then
        ' do over 
        ' the code has never gotten here
        ' but depends on the mix of values
        Return ""
    End If

    ' set losses
    Dim lNdx = lossNdx(RNG.Next(0, lossNdx.Count))
    For n As Int32 = 0 To loseStrk - 1
        result(lNdx + n) = 0
    Next

    ' set the leftovers based on next value to avoid 
    ' extending streaks
    For n As Int32 = 0 To result.Length - 1
        If result(n) = -1 Then
            If n > 0 Then
                result(n) = If(result(n - 1) = 0, 1, 0)
            Else
                result(n) = If(result(n + 1) = 0, 1, 0)
            End If
        End If
    Next

    Dim resultString = String.Join(",", result)

    ' convert to boolean
    Dim realResult(total) As Boolean
    For n As Int32 = 0 To total - 1
        realResult(n) = Convert.ToBoolean(result(n))
    Next

    Return resultString
End Function

' find candidate slots for the shorter streak:
Private Function CountConsecutiveLooserSlotsFrom(ndx As Integer, theArray As Int32()) As Int32
    Dim count As Int32 = 1    ' including ndx

    For n As Int32 = ndx To theArray.Length - 2
        If theArray(n) <> 1 AndAlso theArray(n + 1) <> 1 Then
            count += 1
        Else
            Exit For
        End If
    Next
    Return count
End Function

The method to validate a result candidate (and performance metrics):

Private Function MakeFarhamStreak(wins As Int32, winStreak As Int32,
                                  lossStreak As Int32,
                                  total As Int32) As String
    Const MaxTries As Int32 = 999
    Dim losses = (total - wins)
    Dim reverse As Boolean = (lossStreak > winStreak)
    Dim candidate As String
    Dim sw As New Stopwatch
    Dim pass, fail As Int32
    Dim count As Int32

    sw.Start()

    For n As Int32 = 0 To MaxTries

        If reverse Then
            candidate = FarhamStreaks(lossStreak, winStreak, total)
            ' to do: un-reverse (Not) the results - 
        Else
            candidate = FarhamStreaks(winStreak, lossStreak, total)
        End If

        Dim result = candidate.Split(","c)

        ' test win count
        count = candidate.Where(Function(f) f = "1").Count
        If count <> wins Then
            fail += 1
            Continue For
        End If

        ' test loss count
        count = candidate.Where(Function(f) f = "0").Count
        If count <> losses Then
            fail += 1
            Continue For
        End If

        Dim tmp = candidate.Replace(","c, "")

        ' test win streak size
        Dim wstreaks = tmp.Select(Function(c, i) tmp.Substring(i).
                                      TakeWhile(Function(q) q = c AndAlso q = "1").
                                      Count()).
                                  Max
        If wstreaks <> winStreak Then
            fail += 1
            Continue For
        End If

        Dim lstreaks = tmp.Select(Function(c, i) tmp.Substring(i).
                          TakeWhile(Function(q) q = c AndAlso q = "0").
                          Count()).
                      Max
        If lstreaks <> lossStreak Then
            fail += 1
            Continue For
        End If

        pass += 1
        If pass = 1 Then
            Console.WriteLine("First Pass in {0}ms  (try # {1} = {2})",
                              sw.ElapsedMilliseconds, n, candidate)
            ' normally, return at this point
        End If
    Next

End Function

It is easier to fit the shorter streak around the longer one, so it reverses the parm order as needed. There isnt code to flip/Not the results.

results:

First Pass in 18ms (try # 4 = 1,1,1,1,1,0,0,1,0,1)
Total FAILURES 753 75.38%
Total Pass 247 24.72%
Total time for 999 candidates 29ms

It found the first passing value on try #4 - with the 10, 7w, 5ws, 2ls values it usually finds one in the first 10.

Ňɏssa Pøngjǣrdenlarp
  • 38,411
  • 12
  • 59
  • 178