0

I have two sheets.
Sheet1 is an imported XML table so it looks a little awkward, like a stairstep of tables with several thousand rows.
Sheet2 is a single column that is a list of strings with a few hundred entries.

I want to find every instance of each exact string in Sheet2 within Sheet1, then delete the whole row in Sheet1.

Both Sheet1 and Sheet2 data sources will grow and change over time, so I want to make a general solution.

I have a Python script that is supposed to trim out the XML source using what would be the contents of Sheet2, but the resulting XML is broken and unacceptable as it ends up missing data types.

I don't know if doing this via Excel will preserve the data types.

Assuming sheets are like arrays indexed by [rows,cols]. This is what I'm trying to do, with the functions standing in for whatever needs to be in VBA to achieve the same goal.

String str;
Int r, foundStrRow;

*/rowFind(sheet, string) being a function that finds arg string within sheet and returns the row index

Delete(sheet, row) being a function that deletes the index row from sheet*/

While(Sheet2[r,1]) {
   Str = Sheet2[r, 1];
   foundStrRow = rowFind(Sheet1,Str);
   Delete(Sheet1, foundStrRow);
   r++
}

The data cannot be shared for security reasons but I can provide a faux set of the data here.

https://i.stack.imgur.com/kQVGB.jpg

Imagine that Sheet1 has that stairstep that is about 20000 rows and 200 columns. Sheet2 is 400 rows and one column.

This is what I have come up with. It is able to find and delete a whole row if the string from Sheet2 exists in Sheet1.
I have to figure out how to make it conditional on the string existing. If the string from Sheet2 isn't in Sheet1 then I need to skip to the next entry.

Sub Trimmer()
Dim Rows2 as Integer
Dim wordToSearch as String

'Sets Rows2 to be limit of the loop
With Sheet2
    With .Cells.SpecialCells(xlCellTypeLastCell)
        Rows2 = .Row
    End With
End With

For r2=1 to Rows2
    Sheets("Sheet2").Select
    wordToSearch = Cells(r2,1).Value
    Sheets("Sheet1").Select
    Cells.Find(What := wordToSearch, After := ActiveCell, LookIn := xlFormulas2, _
      LookAt :=xlWhole, SearchOrder := xlByRows, SearchDirection:= xlNext, _
      MatchCase := False, SearchFormat := False).Activate
    Rows(ActiveCell.Row).Select
    Application.CutCopyMode = False
    Selection.Delete Shift:=xlUp

This will iterate and take out the whole rows where the match to the list in Sheet2 exists.
I need to have it skip when something in the Sheet2 list isn't found in Sheet1.

Community
  • 1
  • 1
  • Unfortunately converting code from one language to another is considered out of scope for Stack Overflow. Maybe make an attempt with VBA yourself and ask questions about it instead? – braX Jun 22 '22 at 04:08
  • 1
    Welcome to the wonderful world of VBA :-) I like your approach: writing the pseudocode in pseudo-Python and try to convert it in VBA without knowing the commands. Are you aware that you can record macros? Like that, you can do the actions yourself and see the corresponding VBA commands and functions. There's one trick: while recording macros, Excel has the tendency to use over-elaborated parameter lists and the amount of obsolete `SELECT` statement might be quite large :-). But this gives you already a head start. – Dominique Jun 22 '22 at 06:28
  • For the `SELECT` thing, this might be interesting reading material: https://stackoverflow.com/questions/10714251/how-to-avoid-using-select-in-excel-vba – Dominique Jun 22 '22 at 06:29
  • Could you share the screenshots of the worksheets (especially `Sheet1`), possibly some sample data? You can [edit your post](https://stackoverflow.com/posts/72709417/edit) at any time. – VBasic2008 Jun 22 '22 at 08:07

1 Answers1

0

What I would do is create two arrays to hold sheets 1 and 2. Then loop array 2, do an inner loop for array 1, and look for matches.

Sub MatchLinesFromSheets()
    Dim ArrSheet1(), ArrSheet2()
    Dim Rows1 As Long, r1 As Long, Cols1 As Long, c1 As Long
    Dim Rows2 As Long, r2 As Long
    Dim PasteRow As Long
    
    'Load Sheet1 into array
    With Sheet1
        With .Cells.SpecialCells(xlCellTypeLastCell)
            Rows1 = .Row
            Cols1 = .Column
        End With
        ArrSheet1 = .Range(.Cells(1, 1), .Cells(Rows1, Cols1 + 1)).Value
    End With
    
    'Load Sheet2 into array
    With Sheet2
        With .Cells.SpecialCells(xlCellTypeLastCell)
            Rows2 = .Row
        End With
        ArrSheet2 = .Range(.Cells(1, 1), .Cells(Rows2, 1)).Value
    End With
    
    'Step through each row of Sheet2
    For r2 = 1 To Rows2
        
        'Step through each row and column of Sheet1
        For r1 = 1 To Rows1
        For c1 = 1 To Cols1
            If ArrSheet2(r2, 1) = ArrSheet1(r1, c1) Then
                ArrSheet1(r1, c1) = ""
                ArrSheet1(r1, Cols1 + 1) = "x"
                Exit For
            End If
        Next
        Next
    Next
    
    'Remove blank rows
    For r1 = 1 To Rows1
        If ArrSheet1(r1, Cols1 + 1) <> "x" Then
            PasteRow = PasteRow + 1
            For c1 = 1 To Cols1
                ArrSheet1(PasteRow, c1) = ArrSheet1(r1, c1)
                ArrSheet1(r1, c1) = ""
            Next
        End If
    Next
    
    With Sheets.Add
        .Range(.Cells(1, 1), .Cells(Rows1, Cols1)) = ArrSheet1
    End With
End Sub
  • Thank you for the write up. Learning that I can load sheets as arrays is very helpful. However the statement ArrSheet1(r1, c1) = "" would make the cell blank. What I need is the complete deletion of the row where the match occurs. The match itself is not what needs to be removed, but all the relevant data within that row if that makes sense. I think this could be a good start, looking at this I can find the rows I need. Just need to delete them now. – Kawaii Leonard Jun 22 '22 at 17:53
  • Edited. Try it now. – John Williams Jun 23 '22 at 06:52