I´m trying to make an application in Excel vba, but I´ve some problems. I need my Excel application to download some files that are in a zip format. I´ve already done this part of the problem, my app can download and unzip the files. Next I´ve to read the whole file in .htm extension and get some information from it. It is required to work like this that when app is opened, program should look for the last number of contest which is “concurso” in Basil then look for the same number in .htm file and start copying the next data.
I´ve already discovered a pattern to read the file and get the data I want, but I don’t know how to code for it. The pattern in .htm file to be extracted is inside tags td, a text that have 2 slash, therefore, I´ve a date, at this time, I´ve to do 3 things, get the date, the line above the date I´ve the number of concurso, so I need to get it too, and the 15 lines beneath the date I´ve 15 numbers that I need them too. This pattern doesn’t change and have to be processed till the end of .htm file. and transfer these data to my sheet to be manipulated later.
I shall provide further clarifications in case of some doubts about the problem.
This is the code that I'm using to download and unzip the files.↓
Sub DownloadEUnzip()
Dim FSO, oApp As Object
Dim objHttp, DefPath, Arquivo As String
Dim Dados() As Byte
Dim Fname As Variant
Dim FileNameFolder As Variant
Dim iFileNumber As Long
Dim diretorio As String
diretorio = Dir("c:\lotofacil\D_LOTFAC.HTM")
If diretorio = "D_LOTFAC.HTM" Then
Kill "C:\lotofacil\*"
End If
Set objHttp = CreateObject("MSXML2.ServerXMLHTTP")
objHttp.Open "GET", "http://www1.caixa.gov.br/loterias/_arquivos/loterias/D_lotfac.zip", False
objHttp.Send
DefPath = "C:\lotofacil\" '<<< Altere aqui
Arquivo = DefPath & "D_lotfac.zip"
If objHttp.Status = "200" Then
Dados = objHttp.ResponseBody
iFileNumber = FreeFile
Open Arquivo For Binary Access Write As #iFileNumber
Put #iFileNumber, 1, Dados
Close #iFileNumber
End If
If Right(DefPath, 1) <> "\" Then
DefPath = DefPath & "\"
End If
FileNameFolder = DefPath
Set oApp = CreateObject("Shell.Application")
oApp.Namespace(FileNameFolder).CopyHere oApp.Namespace("C:\lotofacil\D_lotfac.zip").items
MsgBox "Arquivos baixados e descompactados com sucesso!"
End Sub
HEREOne can download the file for visualization of the problem.
ps A folder called lotofacil is to be created on C: drive for spreadsheet proper working.
UPDATE 1
code to find a date
If Mid(dataline, 19, 1) = "/" And Mid(dataline, 22, 1) = "/" Then
Debug.Print dataline
End If
UPDATE 2
so caio, its real fast now, but while i´m using i noticed that the program was taking a column less than a need, and i change the code and it works Apparently..would like u take a look to see if i didnt any mess... i change the size of array, and look like it work :) take a look.
Sub ReadLines()
Dim dataArray() As String
Dim strText
Dim result As String
Dim regExDate As New RegExp, regExAnyContent As New RegExp
Dim matches As MatchCollection
Dim match As match
Dim previous As String, current As String
Dim currentLine As Integer
ReDim dataArray(17, 1000)
regExDate.Pattern = "(\d{2}/\d{2}/\d{4})"
regExAnyContent.Pattern = "<td[^>]*>([^<]*)"
dirPath = "c:\lotofacil\"
filePath = dirPath & "D_LOTFAC.HTM"
result = ""
currentLine = 0
If Not Dir(filePath) = "D_LOTFAC.HTM" Then Exit Sub
FileNum = FreeFile()
Open filePath For Input As #FileNum
previous = ""
While Not EOF(FileNum)
Line Input #FileNum, current ' read in data 1 line at a time
If Len(current) > 0 Then
Set matches = regExDate.Execute(current)
If matches.Count > 0 Then
dataArray(1, currentLine) = matches.Item(0)
dataArray(0, currentLine) = regExAnyContent.Execute(previous).Item(0).SubMatches(0)
For i = 1 To 16
Line Input #FileNum, current
While current = ""
Line Input #FileNum, current
Wend
dataArray(1 + i, currentLine) = regExAnyContent.Execute(current).Item(0).SubMatches(0)
Next
currentLine = currentLine + 1
If currentLine Mod 1000 = 0 Then
ReDim Preserve dataArray(17, currentLine + 1000)
End If
End If
previous = current
End If
' decide what to do with dataline,
' depending on what processing you need to do for each case
Wend
Range(Cells(1, 1), Cells(currentLine, 17)) = Application.Transpose(dataArray)
End Sub
but still happenin a thing really strange, in the sheet that the data is putted, the dates are wrong, i needed them in the format dd/mm/yyyy, and i know i'm already bugging u, but if is too difficult to make this change u could just take off this column of the date ? please...
and first of all thank you very much, u´re really good on excel ;)