1

I am making a macro which imports data from pdf to excel. From everything I have pasted I only need the data from a table containing 50 rows and 7 columns. Each row is imported as a string of numbers separated by space as seen below: I need to delete the rows where the first part of the string is not a number between 1 and 50 (the 50 varies but is given as an input by the user). I have tried to set up a loop as shown - but it is getting to complex for me to figure out so the below definitely will not work - it is just to show my thought process. Furthermore is there a way to cnvert the data in the lines to numbers instead of text?

    Dim A As Integer
    Dim B As Integer
    Dim C As Integer

    Dim MyString() As String

    A = 1
    Do While Not IsEmpty(Cells(A + DataStart - 1, 1)) 'DataStart is the row 
                                                       where data starts
    MyString() = Split(Cells(A + DataStart - 1, 1))
        C = 1
        Do Until C = 50
            If MyString(0) = C Then
                For B = 0 To UBound(MyString)
                    Cells(A, B + 1) = MyString(B)
                Next B
            Else
                ActiveSheet.Cells(A, 1).Select
                ActiveCell.EntireRow.Delete
            End If
        Next C                                
    Loop

Data Example:

44 210,21 22,55 210,21 22,553 196,505 OK        
45 227,59 25,28 226,02 25,612 197,529 OK        
46 228,58 25,31 228,58 25,310 197,827 OK        
2019.06.06. 16:37:28 M94_2019.06.06._17471_Fólia teszt_Felsőparaméter_CB.is_tens        
M94_2019.06.06._17471_Fólia teszt_Felsőparaméter_CB.is_tens 3 oldal a 4-ból/ből     
Max.        
Load        
(N)     
Extension       
at Max.     
(mm)        
Load at     
break       
(N)     
Extension       
at break        
(mm)        
Terhelés 20mm-nél       
(N)     
Note to     
sample      
47 213,54 24,07 200,82 24,410 192,925 OK        
48 234,06 26,23 234,06 26,231 198,417 OK        
49 227,20 25,32 227,20 25,322 197,384 OK        
50 211,45 25,30 211,45 25,300 192,622 OK
Paulo Campez
  • 702
  • 1
  • 8
  • 24
La82
  • 37
  • 4

3 Answers3

0

I would approach this differently. Going back and forth to the spreadsheet, deleting rows, costs a lot of time.

I would

  • read the column into a VBA variant array for speedy processing
  • go through the array collecting the rows you wish to keep
  • write these back to the worksheet
    • either delete and then write in the original location or (which I prefer)
    • write the results elsewhere
  • split using the texttocolumns method

Not sure how you want to format the "split" row. If you leave it as general, and if the comma is your decimal separator, then they will be seen as numbers. If something else, you may need to set the fieldinfo parameter to text for each column.

Option Explicit
Sub terfuge()
    Dim rRes As Range, wsSrc As Worksheet, wsRes As Worksheet
    Dim vSrc As Variant, vRes As Variant, Col As Collection
    Dim I As Long

Set wsSrc = Worksheets("sheet1")
Set wsRes = Worksheets("sheet1")
    Set rRes = wsRes.Cells(1, 2) 'or cells(1,1) if you want to overwrite

With wsSrc
    vSrc = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp))
End With

Set Col = New Collection
For I = 1 To UBound(vSrc, 1)
    Select Case Split(vSrc(I, 1))(0)
        Case 1 To 50
            Col.Add vSrc(I, 1)
    End Select
Next I

ReDim vRes(1 To Col.Count, 1 To 1)
For I = 1 To Col.Count
    vRes(I, 1) = Col(I)
Next I

Set rRes = rRes.Resize(rowsize:=UBound(vRes, 1))
With rRes
    .EntireColumn.Clear
    .Value = vRes
    Application.DisplayAlerts = False 'avoid the "do you want to replace the data" alert
    .TextToColumns DataType:=xlDelimited, textqualifier:=xlTextQualifierDoubleQuote, _
        Tab:=False, semicolon:=False, comma:=False, Space:=True, other:=False
    Application.DisplayAlerts = True
    .CurrentRegion.EntireColumn.AutoFit
End With
End Sub

enter image description here

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
0

You're on the right track, but there are a few things to understand that will greatly help you in your code:

  1. Always use Option Explicit to ensure your variables are declared the way you want
  2. Avoid using Select and Activate

Here's an example that will work to get you started.

Option Explicit

Sub test()
    Const min As Long = 1
    Const max As Long = 50

    Dim dataRange As Range
    Set dataRange = Sheet1.UsedRange

    Dim topRow As Long
    Dim bottomRow As Long
    With dataRange
        topRow = .Rows(1).Row
        bottomRow = .Rows(.Rows.Count).Row
    End With

    Dim tokens As Variant
    Dim value As Variant
    Dim saveThisRow As Boolean
    Dim i As Long
    For i = bottomRow To topRow Step -1
        saveThisRow = False
        tokens = Split(dataRange.Cells(i, 1).value, " ")
        If IsArray(tokens) Then
            value = tokens(0)
            If IsNumeric(value) Then
                If value >= min And value <= max Then
                    saveThisRow = True
                End If
            End If
        End If
        If Not saveThisRow Then
            dataRange.Cells(i, 1).EntireRow.Delete
        End If
    Next i
End Sub
PeterT
  • 8,232
  • 1
  • 17
  • 38
0

Here's my take on it. Similar to PeterT's solution. When dealing with large files Ron is absolutely correct in how expensive deleting rows can be. So writing out the the formatted data to another sheet will prove to be quicker.

Sub ParseData()

    Dim lowBound As Integer
    Dim hiBound As Integer

    lowBound = 1
    hiBound = 50

    Dim currentWS As Worksheet
    Set currentWS = ThisWorkbook.Worksheets("Sheet1") '' Change this to the sheet your data is stored on

    Dim allData As Range
    '' Define where your data starts and ends, change this as needed
    Set allData = currentWS.Range("A1", currentWS.Range("A1").End(xlDown))

    Dim datRng As Range

    Dim sploded() As String

    '' Loop backwards on data since deleting will cause row skips if you do forwards
    For i = allData.Cells.Count To 1 Step -1

        Set datRng = allData.Cells(i, 1) 'Looking at a single cell

        sploded = Split(datRng.Value, " ") 'Space delimited to array

        If IsNumeric(sploded(0)) = True Then
            ' if the first number is within the bounds
            If CInt(sploded(0)) <= hiBound And CInt(sploded(0)) >= lowBound Then

                '' Overwrite with the data into cells
                For j = LBound(sploded) To UBound(sploded)
                    datRng.Offset(0, j).Value = sploded(j)
                Next j
            Else
                datRng.EntireRow.Delete '' Is number, but outside the bounds
            End If
        Else
            datRng.EntireRow.Delete '' Isn't a number
        End If
    Next i


End Sub
dasche
  • 111
  • 6