0

I'm trying to split 700,000 rows into about 27 different tabs, based on manager name. This is obviously a large amount of data and excel runs out of memory and only manages to put across about 100 lines into 1 tab

Does anyone have any idea on how to make the code below more efficient or a different way of getting around running out of memory

Maybe sorting the data first and then cutting and pasting into their own tabs? I'm not sure

Current code:

Sub parse_data()
Dim lr As Long
Dim ws As Worksheet
Dim vcol, i As Long
Dim icol As Long
Dim myarr As Variant
Dim title As String
Dim titlerow As Integer
vcol = 19
Set ws = Sheets("FCW")
lr = ws.Cells(ws.Rows.Count, vcol).End(xlUp).Row
title = "A1:T1"
titlerow = ws.Range(title).Cells(1).Row
icol = ws.Columns.Count
ws.Cells(1, icol) = "Unique"
For i = 2 To lr
On Error Resume Next
If ws.Cells(i, vcol) <> "" And Application.WorksheetFunction.Match(ws.Cells(i, vcol), ws.Columns(icol), 0) = 0 Then
ws.Cells(ws.Rows.Count, icol).End(xlUp).Offset(1) = ws.Cells(i, vcol)
End If
Next
myarr = Application.WorksheetFunction.Transpose(ws.Columns(icol).SpecialCells(xlCellTypeConstants))
ws.Columns(icol).Clear
For i = 2 To UBound(myarr)
ws.Range(title).AutoFilter field:=vcol, Criteria1:=myarr(i) & ""
If Not Evaluate("=ISREF('" & myarr(i) & "'!A1)") Then
Sheets.Add(after:=Worksheets(Worksheets.Count)).Name = myarr(i) & ""
Else
Sheets(myarr(i) & "").Move after:=Worksheets(Worksheets.Count)
End If
ws.Range("A" & titlerow & ":A" & lr).EntireRow.Copy Sheets(myarr(i) & "").Range("A1")
Sheets(myarr(i) & "").Columns.AutoFit
Next
ws.AutoFilterMode = False
ws.Activate
End Sub
Ashley Thomson
  • 37
  • 1
  • 10
  • Maybe have a look at VBA Practices, in particular [Switching off functionality](http://stackoverflow.com/documentation/excel-vba/1107/vba-best-practices/5925/switch-off-functionality-during-macro-execution#t=201611240913299254555). – Clusks Nov 24 '16 at 09:14
  • I tried using the SpeedUp snippet of code and it still only manages to copy around 8000 rows into the first managers tab. Then just running out of memory – Ashley Thomson Nov 24 '16 at 09:26
  • Are you running 32 bit Excel? That is more prone to memory issues than 64 bit, I believe. You could try splitting this into functions that can be called from within a sub, as I believe that will help. There's also [this suggestion](http://stackoverflow.com/a/33828800/6204844) regarding saving the Workbook often, as it purges the memory, although I can't vouch for that – Clusks Nov 24 '16 at 09:33

3 Answers3

1

Wow. Lots and lots of comments here. @OP, did you ever get this working? If you are still looking for a solution, try this.

Sub Copy_To_Worksheets()
'Note: This macro use the function LastRow
    Dim My_Range As Range
    Dim FieldNum As Long
    Dim CalcMode As Long
    Dim ViewMode As Long
    Dim ws2 As Worksheet
    Dim Lrow As Long
    Dim cell As Range
    Dim CCount As Long
    Dim WSNew As Worksheet
    Dim ErrNum As Long

    'Set filter range on ActiveSheet: A1 is the top left cell of your filter range
    'and the header of the first column, D is the last column in the filter range.
    'You can also add the sheet name to the code like this :
    'Worksheets("Sheet1").Range("A1:D" & LastRow(Worksheets("Sheet1")))
    'No need that the sheet is active then when you run the macro when you use this.
    Set My_Range = Range("A1:D" & LastRow(ActiveSheet))
    My_Range.Parent.Select

    If ActiveWorkbook.ProtectStructure = True Or _
       My_Range.Parent.ProtectContents = True Then
        MsgBox "Sorry, not working when the workbook or worksheet is protected", _
               vbOKOnly, "Copy to new worksheet"
        Exit Sub
    End If

    'This example filters on the first column in the range(change the field if needed)
    'In this case the range starts in A so Field:=1 is column A, 2 = column B, ......
    FieldNum = 1

    'Turn off AutoFilter
    My_Range.Parent.AutoFilterMode = False

    'Change ScreenUpdating, Calculation, EnableEvents, ....
    With Application
        CalcMode = .Calculation
        .Calculation = xlCalculationManual
        .ScreenUpdating = False
        .EnableEvents = False
    End With
    ViewMode = ActiveWindow.View
    ActiveWindow.View = xlNormalView
    ActiveSheet.DisplayPageBreaks = False

    'Add a worksheet to copy the a unique list and add the CriteriaRange
    Set ws2 = Worksheets.Add

    With ws2
        'first we copy the Unique data from the filter field to ws2
        My_Range.Columns(FieldNum).AdvancedFilter _
                Action:=xlFilterCopy, _
                CopyToRange:=.Range("A1"), Unique:=True

        'loop through the unique list in ws2 and filter/copy to a new sheet
        Lrow = .Cells(Rows.Count, "A").End(xlUp).Row
        For Each cell In .Range("A1:A" & Lrow)

            'Filter the range
            My_Range.AutoFilter Field:=FieldNum, Criteria1:="=" & _
             Replace(Replace(Replace(cell.Value, "~", "~~"), "*", "~*"), "?", "~?")

            'Check if there are no more then 8192 areas(limit of areas)
            CCount = 0
            On Error Resume Next
            CCount = My_Range.Columns(1).SpecialCells(xlCellTypeVisible) _
                     .Areas(1).Cells.Count
            On Error GoTo 0
            If CCount = 0 Then
                MsgBox "There are more than 8192 areas for the value : " & cell.Value _
                     & vbNewLine & "It is not possible to copy the visible data." _
                     & vbNewLine & "Tip: Sort your data before you use this macro.", _
                       vbOKOnly, "Split in worksheets"
            Else
                'Add a new worksheet
                Set WSNew = Worksheets.Add(After:=Sheets(Sheets.Count))
                On Error Resume Next
                WSNew.Name = cell.Value
                If Err.Number > 0 Then
                    ErrNum = ErrNum + 1
                    WSNew.Name = "Error_" & Format(ErrNum, "0000")
                    Err.Clear
                End If
                On Error GoTo 0

                'Copy the visible data to the new worksheet
                My_Range.SpecialCells(xlCellTypeVisible).Copy
                With WSNew.Range("A1")
                    ' Paste:=8 will copy the columnwidth in Excel 2000 and higher
                    ' Remove this line if you use Excel 97
                    .PasteSpecial Paste:=8
                    .PasteSpecial xlPasteValues
                    .PasteSpecial xlPasteFormats
                    Application.CutCopyMode = False
                    .Select
                End With
            End If

            'Show all data in the range
            My_Range.AutoFilter Field:=FieldNum

        Next cell

        'Delete the ws2 sheet
        On Error Resume Next
        Application.DisplayAlerts = False
        .Delete
        Application.DisplayAlerts = True
        On Error GoTo 0

    End With

    'Turn off AutoFilter
    My_Range.Parent.AutoFilterMode = False

    If ErrNum > 0 Then
        MsgBox "Rename every WorkSheet name that start with ""Error_"" manually" _
             & vbNewLine & "There are characters in the name that are not allowed" _
             & vbNewLine & "in a sheet name or the worksheet already exist."
    End If

    'Restore ScreenUpdating, Calculation, EnableEvents, ....
    My_Range.Parent.Select
    ActiveWindow.View = ViewMode
    With Application
        .ScreenUpdating = True
        .EnableEvents = True
        .Calculation = CalcMode
    End With

End Sub


Function LastRow(sh As Worksheet)
    On Error Resume Next
    LastRow = sh.Cells.Find(What:="*", _
                            After:=sh.Range("A1"), _
                            Lookat:=xlPart, _
                            LookIn:=xlValues, _
                            SearchOrder:=xlByRows, _
                            SearchDirection:=xlPrevious, _
                            MatchCase:=False).Row
    On Error GoTo 0
End Function

I just tested the functionality by putting =randbetween(1,27) from A1:A700000. The script did everything in less than 30 seconds on my very old ThinkPad with 12GB RAM.

  • Hi there, I haven't managed to get the other 2 answers working with my data yet. Just have a quick question regarding yours, probably sound stupid but here goes - The column I need it to look at for which manager name it needs to split the data by is S so column 19 and the total data set goes from columns A to T. I'm not quite sure which bits of your code to amend to meet this requirement...I'm still very new to vba, its quite scary lol – Ashley Thomson Dec 01 '16 at 15:53
  • Actually @queens I think I've got it to work! thank you so much for this answer, such a massive help with what I'm trying to do, couldn't thank you enough! – Ashley Thomson Dec 02 '16 at 15:14
0

Edit2: Added a loop over manager names stored in a string.

i. In general, turning off screen updating in Excel can speed things up.

    On Error Goto skpError
    Application.ScreenUpdating = False    

    ' your code....

skpError:
    Application.ScreenUpdating = True

ii. If you consider a major overhaul, the following could provide a starting point.

I used simplified sample data like this

manager revenue
Henry   500
Henry   500
Willy   500
Willy   500
Billy   500
Billy   500

In short, it does the following:

  • it reads your data into a recordset
  • it filters the recordset based on the manager-name
  • it copies the records from the recordset to the sheet with the manager-name
  • since it doens't explicitly loop every row, it should perform considerably faster than what you had so far

Hope that helps!

Sub WorkWithRecordset()

Dim ws As Worksheet
Dim iCols As Integer

' 1. Reading all the data into a recordset
Dim xlXML As Object
Dim rst As Object

Set rst = CreateObject("ADODB.Recordset")
Set xlXML = CreateObject("MSXML2.DOMDocument")

xlXML.LoadXML ThisWorkbook.Sheets("Data").UsedRange.Value(xlRangeValueMSPersistXML)
rst.Open xlXML

' 2. manager names -  we could also put those into a recordset (similar to above)
'    for showing reasons i use an array here
'    note: i use 2 Variant variables, so I can loop over the arrays-entries without using LBOUND() to UBOUND()

 Dim varManager As Variant
 varManager = Split("Billy;Willy;Henry", ";")


 ' 3. loop over the managers
 Dim manager As Variant
 For Each manager In varManager

    ' set the outputsheet
    Set ws = ThisWorkbook.Sheets(manager)

    ' set the filter on managername
    rst.Filter = "manager = '" & manager & "'"


    With ws

        ' Print the headers
        For iCols = 0 To rst.Fields.Count - 1
            .Cells(1, iCols + 1).Value = rst.Fields(iCols).Name
        Next

        ' Print the data
        .Range("A2").CopyFromRecordset rst

    End With

    ' delete the filter
    rst.Filter = ""

Next manager
' end of manager-loop


Debug.Print "Done. Time " & Now

End Sub


Function GetRecordset(rng As Range) As Object

'Recordset ohne Connection:
'https://usefulgyaan.wordpress.com/2013/07/11/vba-trick-of-the-week-range-to-recordset-without-making-connection/

Dim xlXML As Object
Dim rst As Object

Set rst = CreateObject("ADODB.Recordset")
Set xlXML = CreateObject("MSXML2.DOMDocument")
xlXML.LoadXML rng.Value(xlRangeValueMSPersistXML)

rst.Open xlXML

Set GetRecordset = rst

End Function

Note:

a) the code assumes that there are existing, empty sheets called "Henry", "Billy", "Willy"

b) with 27 sheets you could create manager-sheets dynamically, if they don't already exist

c) i copied the entire rows. if you only need a selection of fields, you could still loop the filtered recordset and access single fields with something like rst!manager

Martin Dreher
  • 1,514
  • 2
  • 12
  • 22
  • Thank you very much for providing a starting point. I'll attempted to recreate and understand this with my code. But some assistance with getting it to loop through etc would be greatly appreciated, quite new to VBA unfortunately – Ashley Thomson Nov 24 '16 at 10:32
  • do you know in advance which manager-names are in your data? Meaning: is there a separate list with all the distinct manager names? That would simplify things. – Martin Dreher Nov 24 '16 at 10:44
  • There are 27 Names that I know will be in there, and we are informed about any changes so it should be okay – Ashley Thomson Nov 24 '16 at 11:51
  • Thanks very much for providing this method, I've been playing around with it and excel still runs out of memory when trying to put all the data in the record sheet – Ashley Thomson Nov 29 '16 at 11:44
0

My test stub has 700,000 Rows and 20 Columns of data, 100MB on disk. It takes 6.5 Seconds to parse the data into 27 different worksheets. I'm pretty happy with the results considering it takes 26 Seconds to save the file.

enter image description here

Class Module: ManagerClass

Option Explicit
'Adjust MAXROWS if any Manage will have more than 60000
Private Const MAXROWS As Long = 60000
Private Data
Private m_Manager As String
Private m_ColumnCount As Integer
Private m_Header As Range
Private x As Long
Private y As Integer

Public Sub Init(ColumnCount As Integer, Manager As String, Header As Range)
    m_Manager = Manager
    m_ColumnCount = ColumnCount
    Set m_Header = Header
    ReDim Data(1 To MAXROWS, 1 To ColumnCount)
    x = 1
End Sub

Public Sub Add(Datum As Variant)
    y = y + 1
    If y > m_ColumnCount Then
        y = 1
        x = x + 1
    End If
    Data(x, y) = Datum
End Sub

Private Sub Class_Terminate()
    Dim wsMGR As Worksheet
    If Evaluate("=ISREF('" & m_Manager & "'!A1)") Then
        Set wsMGR = Worksheets(m_Manager)
        wsMGR.Cells.Clear
    Else
        Set wsMGR = Sheets.Add(after:=Worksheets(Worksheets.Count))
        wsMGR.Name = m_Manager
    End If
    wsMGR.Range(m_Header.Address) = m_Header
    wsMGR.Range("A2").Resize(x, m_ColumnCount).Value = Data
End Sub

Standard Module: ParseData

Sub ParseData()
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual

    Const MGRCOLUMN As Integer = 19
    Const HEADERROW As String = "A1:T1"
    Dim Data, MGRData
    Dim key As String
    Dim MGRClass As ManagerClass
    Dim x As Long, y As Long
    Dim dicMGR As Object
    Set dicMGR = CreateObject("Scripting.Dictionary")

    Dim lastRow As Long, z As Long, z2 As Long

    With Sheets("FCW")
        lastRow = .Cells(.Rows.Count, MGRCOLUMN).End(xlUp).Row

        For z = 2 To lastRow Step 10000
            z2 = IIf(z + 10000 > lastRow, lastRow, z + 10000)

            Data = .Range(Cells(z, 1), .Cells(z2, MGRCOLUMN + 1))
            For x = 1 To UBound(Data, 2)
                key = Data(x, MGRCOLUMN)
                If Not dicMGR.Exists(key) Then
                    Set MGRClass = New ManagerClass
                    MGRClass.Init UBound(Data, 2), key, .Range(HEADERROW)
                    dicMGR.Add key, MGRClass
                End If

                For y = 1 To UBound(Data, 2)
                    dicMGR(key).Add Data(x, y)
                Next
            Next
        Next
    End With

    Application.Calculation = xlCalculationAutomatic
    Application.ScreenUpdating = False
End Sub
  • that looks pretty neat! – Martin Dreher Nov 25 '16 at 09:12
  • @MartinDreher If you're here is the download link:[Split TAbs.xlsm](https://www.dropbox.com/s/lnxc4g6j2i3kipo/Split%20to%20tabs.xlsm?dl=0). I cleared out the test data to reduce file size. The mod named `UnitTests` has a procedure `FillRawData` that will add the test data, as will as a couple of subroutines to run the test. `TestParseData` is the relevant one. –  Nov 25 '16 at 12:32
  • corporate environment, can't access dropbox - but I'll make sure to look into it after work. Kudos. – Martin Dreher Nov 25 '16 at 12:47
  • @Thomas Inzina I'm getting a "compile error: User-defined type no defined" when trying to run this, is there a reference I need to add to get this to run or could it be another issue? – Ashley Thomson Nov 28 '16 at 13:20
  • You need to insert a new class, rename it to ManagerClass and insert the ManagerClass code into it. You could download and open SplitTabs.xlsm from my comment above then drag the ManagerClass class into your workbook –  Nov 28 '16 at 13:36
  • ah of course, that's my bad. Im just getting an error on ' wsMGR.Range(m_Header.Address) = m_Header' in the class now. Run-time error 1004 - Application-defined or object-defined error. sorry to be a pain, new to vba – Ashley Thomson Nov 28 '16 at 13:48
  • @Thomas Inzina I would download your doc but Dropbox is blocked here – Ashley Thomson Nov 28 '16 at 14:02
  • Can you link me your workbook? Is Google Docs blocked? –  Nov 28 '16 at 14:18
  • yep unfortunately, quite strict on what we can access on the corporate machines – Ashley Thomson Nov 28 '16 at 14:20
  • @ThomasInzina ah found the problem, it was a strange format in the right hand column of the export. Vba works and is splitting it correctly. but only about 40 items into each tab? thanks very much for this by that way, should have said soon – Ashley Thomson Nov 28 '16 at 14:55
  • I'm not sure why it would be limited. My test stub had 27 different names with 25,000+ records each. –  Nov 28 '16 at 15:57
  • @ThomasInzina Now at home - downloaded the doc using your link, it gives a type missmatch error on 'If Evaluate("=ISREF('" & m_Manager & "'!A1)") Then'. and is similirarly only copying around 40-70rows into each tab, and missing the top row of headers. know the fix to this? would be much appreciated – Ashley Thomson Nov 28 '16 at 17:17