1

I have the next problem, I need to compare an excel file with 127,000 rows approximately with a text file with 1'500,000 lines approximately. Every line in the text file is divided by eight pipes, having 8 columns in excel. I need to do a VLookUp with the fifth column in the text line with certain column in my excel file.

What I've been doing is:

  1. I take the first row in my excel file, I look for it in the text file and I import only that line, then I fit the line in their respective columns in excel and at the end I do the VLookUp.

Sub test()
Dim textline As String, rw As Long, lRow As Long, rw1 As Long
Dim sPath As String, text As Variant

rw1 = 1
sPath = "My_path"

With ActiveWorkbook.Sheets("Sheet1")
   For rw = 2 To .Cells(Rows.Count, 1).End(xlUp).Row
       text = range("A" & rw).Value2
       Open sPath For Input As #1
       Do Until EOF(1)
           Line Input #1, textline
             If InStr(textline, CStr(text)) Then
               Sheets(2).Cells(rw1, 1) = textline
               Sheets(2).range("A" & rw1).TextToColumns Destination:=range("A" & rw1 & ":H" & rw1), DataType:=xlDelimited, _
                   TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=False, _
                   Semicolon:=False, Comma:=False, Space:=False, Other:=True, OtherChar _
                   :="|", TrailingMinusNumbers:=True
               rw1 = rw1 + 1
               Exit Do
           End If
       Loop
       Close #1
   Next rw
End With
End Sub
  1. I import all the text file to excel, then I fit the line in their respective columns and at the end I do the VLookUp.

Sub test()
Dim textline As String, rw As Long, lRow As Long, rw1 As Long
Dim sPath As String, text As Variant, sh As String, c As Integer

sh = "Sheet"
c = 1
rw = 1
rw1 = 1
sPath = "My_path"


With ActiveWorkbook.Sheets("Sheet1")
       Open sPath For Input As #1
       Do Until EOF(1)
           Line Input #1, textline
             Sheets("Sheet1").range("A" & rw) = textline
               Exit Do
             End If
             rw = rw + 1
       Loop
       Close #1
End With

With ActiveWorkbook.Sheets(1)
   For rw1 = 1 To .Cells(Rows.Count, 1).End(xlUp).Row
       Sheets("Sheet1").range("A" & rw1).TextToColumns Destination:=range("A" & rw1 & ":H" & rw1), DataType:=xlDelimited, _
           TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=False, _
           Semicolon:=False, Comma:=False, Space:=False, Other:=True, OtherChar _
           :="|", TrailingMinusNumbers:=True
   Next rw1
End With
End Sub
Now

The problem is that the two methods takes too long, Does anyone know a better solution that can help me with this?

Thank you in advice

Alex K.
  • 171,639
  • 30
  • 264
  • 288
  • 2
    The [Performance](https://fastexcel.wordpress.com/2011/10/26/match-vs-find-vs-variant-array-vba-performance-shootout/) gets better with [Variant arrays](https://stackoverflow.com/questions/23868137/which-the-fastest-way-to-sum-two-range/23869738#23869738) rather than Ranges. – danieltakeshi Sep 19 '17 at 17:09
  • Would likely be faster to use `Split(textline, "|")` to get an array of values than to use TextToColumns. Other improvements might depend on (eg) whether the field being matched on is unique on your worksheet and/or in your text file – Tim Williams Sep 19 '17 at 18:53
  • Good idea Tim, thanks. I guess instead of the array I will use a matrix using the same concept of the variant array. Best Regards – Ricardo Aguilar Sep 19 '17 at 20:20

1 Answers1

0

Since you're over the normal row limit, I'd recommend using Get & Transform (for Excel 2016) or Power Query (for Excel 2010 and 2013) if you're set on using Excel.

Daniel McCracken
  • 484
  • 1
  • 5
  • 11
  • 1
    The problem with that is that I have to use Excel 2007, Is what the company uses – Ricardo Aguilar Sep 19 '17 at 17:10
  • You'd get a decent performance boost from putting all of your worksheet data into an array first, then iterating through that. Then you're only reading from the worksheet once instead of 100k+ times. But it's still going to be slow; Excel 2007 really isn't meant to handle more than 1 million rows of data. – Daniel McCracken Sep 19 '17 at 17:23
  • Thanks Daniel, I will try it. I only hope my RAM can handle it. – Ricardo Aguilar Sep 19 '17 at 17:33