2

I'm trying to copy across 70 rows randomly selected based on certain criteria across to another sheet but ensuring only 70 unique rows exist in the second sheet once copied across.

My below code copies over the 70 rows correctly as per the required criteria but it's also copying across duplicate rows as there's no logic to select another row if there's a duplicate value in the array.

Any help on modifying the code to select another row if the row already exists in the array would be greatly appreciated.

I think I need to store the random selected rows and then check that the next selected row is not in that array already else select another row?

Sub MattWilliams()

    Dim rawDataWs As Worksheet, randomSampleWs As Worksheet
    Dim map, i As Long, n As Long, c As Long, rand, col
    Dim rng As Range
    Dim keyArr, nRowsArr

    Set rawDataWs = Worksheets("Master")
    Set randomSampleWs = Worksheets("Checks")

    randomSampleWs.UsedRange.ClearContents

    'EDIT: dynamic range in ColA
    Set rng = rawDataWs.Range("AT9:AT" & rawDataWs.Cells(Rows.Count, 1).End(xlUp).Row)

    Set map = RowMap(rng)

    keyArr = Array("ALS", "Customer") '<== keywords
    nRowsArr = Array(65, 5) '<== # of random rows

    Debug.Print "Key", "#", "Row#"
    For i = LBound(keyArr) To UBound(keyArr) 'loops through lower and upper bound of the KeyArr
        If map.exists(keyArr(i)) Then

            Set col = map(keyArr(i))
            n = nRowsArr(i)

            For c = 1 To n
                'select a random member of the collection
                rand = Application.Evaluate("RANDBETWEEN(1," & col.Count & ")")
                Debug.Print keyArr(i), rand, col(rand)

            If rawDataWs.Range("S" & col(rand)).Value = "FTF" Then

                 rawDataWs.Rows(col(rand)).Copy _
        randomSampleWs.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0)
    If col.Count = 0 Then
        If c < n Then Debug.Print "Not enough rows for " & keyArr(i)
        c = c - 1
    End If

Else
    c = c - 1


            End If
                'col.Remove rand 'remove the "used" row
                If col.Count = 0 Then
                    If c < n Then Debug.Print "Not enough rows for " & keyArr(i)

                End If
            Next c

        Else
            Debug.Print "No rows for " & keyArr(i)
        End If
    Next i
End Sub

'get a map of rows as a dictionary where each value is a collection of row numbers
Function RowMap(rng As Range) As Object
    Dim dict, c As Range, k
    Set dict = CreateObject("scripting.dictionary")
    For Each c In rng.Cells
        k = Trim(c.Value)
        If Len(k) > 0 Then
            If Not dict.exists(k) Then dict.Add k, New Collection
            dict(k).Add c.Row
        End If
    Next c
    Set RowMap = dict
End Function

If you need any more information please let me know

Regards,

Matt

Matt Williams
  • 81
  • 1
  • 9
  • Can you express this using SQL? Because you can issue `SELECT` statements against an Excel worksheet using ADO, and paste the results (in a `Recordset` object) into an Excel worksheet using the `CopyFromRecordset` method. – Zev Spitz Aug 23 '18 at 11:19
  • @ZevSpitz Unfortunately not, the data is entirely in Excel so VBA has to be used. Not ideal I know – Matt Williams Aug 23 '18 at 12:04
  • 1
    I suspect I wasn't clear enough. Even if the data is purely in Excel, you can run SQL statements against it from VBA, as long as the data is tabular, and you don't care about the formatting, only the values in the cells -- `SELECT DISTINCT F1, F2 FROM [Sheet1$] WHERE F3 > 5 GROUP BY f4`.. You might even be able to use `SELECT DISTINCT F1, F2 FROM [Sheet1$] IN ""c:\path\to\excel.xlsx"" ""Excel 12.0;""` – Zev Spitz Aug 23 '18 at 12:09
  • Some examples: [1](https://stackoverflow.com/a/51603572/111794), [2](https://stackoverflow.com/a/51603327/111794), [3](https://stackoverflow.com/a/51526756/111794). – Zev Spitz Aug 23 '18 at 12:15
  • @ZevSpitz I'll try this method out as well and see what happens. There seems multiple ways to solve my problem so it would be interesting to test them all out and see what works best for me. Thanks again for the help ! – Matt Williams Aug 23 '18 at 19:31

1 Answers1

1

You need to use an array of unique random numbers to assure that they are not the same. Unique random numbers function can be found here. (drop a upvote if useful)

Sub MattWilliams()

    Dim rawDataWs As Worksheet, randomSampleWs As Worksheet
    Dim map, i As Long, n As Long, c As Long, rand, col
    Dim rng As Range
    Dim keyArr, nRowsArr
    Dim samplepattern() As Long ' dim the samplepattern

    Set rawDataWs = Worksheets("Master")
    Set randomSampleWs = Worksheets("Checks")

    randomSampleWs.UsedRange.ClearContents

    'EDIT: dynamic range in ColA
    Set rng = rawDataWs.Range("AT9:AT" & rawDataWs.Cells(Rows.Count, 1).End(xlUp).Row)

    Set map = RowMap(rng)

    keyArr = Array("ALS", "Customer") '<== keywords
    nRowsArr = Array(65, 5) '<== # of random rows

    Debug.Print "Key", "#", "Row#"
    For i = LBound(keyArr) To UBound(keyArr) 'loops through lower and upper bound of the KeyArr
        If map.exists(keyArr(i)) Then

            Set col = map(keyArr(i))
            n = nRowsArr(i)
            '''''''''''''''''''''''''''''''''''''''''
            'solution starts here
            samplepattern = UniuqeRandom(1, col.Count,n) 'see link "here"

            For c = 1 To n
                Debug.Print keyArr(i), samplepattern(n), col(rand)

            If rawDataWs.Range("S" & col(samplepattern(n))).Value = "FTF" Then

                 rawDataWs.Rows(col(samplepattern(n))).Copy _
        randomSampleWs.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0)
        ' end of solution
        '''''''''''''''''''''''''''''''''''''''
    If col.Count = 0 Then
        If c < n Then Debug.Print "Not enough rows for " & keyArr(i)
        c = c - 1
    End If

Else
    c = c - 1


            End If
                'col.Remove rand 'remove the "used" row
                If col.Count = 0 Then
                    If c < n Then Debug.Print "Not enough rows for " & keyArr(i)

                End If
            Next c

        Else
            Debug.Print "No rows for " & keyArr(i)
        End If
    Next i
End Sub

So basically you get a set of random numbers, all unique before hand. Then you loop through your set and copy all rownumbers that are contained in that set.

example: samplepattern() = [2,3,7,17] are 4 unique random numbers between 1 and 20. Now I go ahead and loop through all members of samplepattern() and copy the rows(samplepattern(i)). So i copy row number 2,3,7 and 17.

  • Thank you for your reply. I didn't get it to work for me first time so i'm going to keep going with it until I get it. I need to read more on the Unique random numbers function as I think that's where I'm going wrong. I'll keep you posted though but i'll upvote both this answer and the link you gave as they are both really informative and will help a lot – Matt Williams Aug 23 '18 at 19:13
  • @mattwilliams If it is still not working update your question wil the current code an d psot the errormessages/ whats not working. – Lucas Raphael Pianegonda Aug 24 '18 at 06:31