I would like to ask how to extract some information from an HTML document stored in a network folder and put it into a memo or a text file.

The HTML file contains this information:

<div>
<a name="N02CB3FB0N02CB7834"></a>
<table width="600" class="data">
<tr>
<td width="200" valign="middle" style="background-color:white;" rowspan="11"><a href="file://///SRVTOPS/TRUMPF_PDM2/DATA_LASER/Pellenc/Cierny/04_0/108536B.GEO"><img width="200" alt="108536B.GEO" src="file://///SRVTOPS/TRUMPF_PDM2/DATA_ORDER/JOBM3046/JOBM3046_1_108536B_1.BMP"></a></td><td valign="bottom">C. d&iacute;lu:
                      </td><td valign="bottom">1&nbsp;
                      </td>
</tr>
<tr>
<td valign="bottom">Dil:
                      </td><td valign="bottom">108536B&nbsp;
                      </td>
</tr>
<tr>
<td valign="bottom">N&aacute;zev v&yacute;kresu:
                      </td><td valign="bottom">PORTEUR&nbsp;
                      </td>
</tr>
<tr>
<td valign="bottom">C&iacute;slo zak&aacute;zky:
                      </td><td valign="bottom">*</td>
</tr>
<tr>
<td valign="bottom">zakaznik</td><td valign="bottom">Pellenc</td>
</tr>
<tr>
<td valign="bottom">ks:
                      </td><td valign="bottom">9</td>
</tr>
<tr>
<td valign="bottom">Rozmer:
                      </td><td valign="bottom">850.0 mm
                         &nbsp;x
                         31.2 mm</td>
</tr>
<tr>
<td valign="bottom">Plocha:
                      </td><td valign="bottom">26107.27 mm2</td>
</tr>
<tr>
<td valign="bottom">Hmotnost:
                      </td><td valign="bottom">0.820 kg
                            </td>
</tr>
<tr>
<td valign="bottom">Trvani:
                      </td><td valign="bottom">00:
                00:
                31</td>
</tr>
<tr>
<td valign="bottom">Soubor dilu:
                      </td><td valign="bottom"><a href="file://///SRVTOPS/TRUMPF_PDM2/DATA_LASER/Pellenc/Cierny/04_0/108536B.GEO">108536B.GEO</a></td>
</tr>
</table>
</div>
<div>
<a name="N02CB3FB0N02CB7FA8"></a>
<table width="600" class="data">
<tr>
<td width="200" valign="middle" style="background-color:white;" rowspan="11"><a href="file://///SRVTOPS/TRUMPF_PDM2/DATA_LASER/VYTSTUHA_8KS.GEO"><img width="200" alt="VYTSTUHA_8KS.GEO" src="file://///SRVTOPS/TRUMPF_PDM2/DATA_ORDER/JOBM3046/JOBM3046_1_VYTSTUHA_8KS_2.BMP"></a></td><td valign="bottom">C. d&iacute;lu:
                      </td><td valign="bottom">2&nbsp;
                      </td>
</tr>
<tr>
<td valign="bottom">Dil:
                      </td><td valign="bottom">NOID_2&nbsp;
                      </td>
</tr>
<tr>
<td valign="bottom">N&aacute;zev v&yacute;kresu:
                      </td><td valign="bottom">&nbsp;
                      </td>
</tr>
<tr>
<td valign="bottom">C&iacute;slo zak&aacute;zky:
                      </td><td valign="bottom">*</td>
</tr>
<tr>
<td valign="bottom">zakaznik</td><td valign="bottom"></td>
</tr>
<tr>
<td valign="bottom">ks:
                      </td><td valign="bottom">8</td>
</tr>
<tr>
<td valign="bottom">Rozmer:
                      </td><td valign="bottom">140.0 mm
                         &nbsp;x
                         48.8 mm</td>
</tr>
<tr>
<td valign="bottom">Plocha:
                      </td><td valign="bottom">4271.95 mm2</td>
</tr>
<tr>
<td valign="bottom">Hmotnost:
                      </td><td valign="bottom">0.134 kg
                            </td>
</tr>
<tr>
<td valign="bottom">Trvani:
                      </td><td valign="bottom">00:
                00:
                07</td>
</tr>
<tr>
<td valign="bottom">Soubor dilu:
                      </td><td valign="bottom"><a href="file://///SRVTOPS/TRUMPF_PDM2/DATA_LASER/VYTSTUHA_8KS.GEO">VYTSTUHA_8KS.GEO</a></td>
</tr>
</table>
<hr>
</div>

So as a result I need: 108536, VYTSTUHA_8KS

For each part I also need to find its BMP file and show it in an image.

Many thanks.

Hello, sorry. Yes, I know how to read the HTML file and put it into a memo, but after that I need to find some specific information in it. Here is the code that loads the HTML into the memo:

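// Requires ComObj, ActiveX, MSHTML and Variants in the uses clause.
// Declarations assumed by the code below:
var
  sHTMLFile: string;
  Strl: TStringList;
  IDoc: IHTMLDocument2;
  v: Variant;
begin
  if OpenDialog1.Execute then
  begin
    sHTMLFile := OpenDialog1.FileName;
    Strl := TStringList.Create;
    try
      // Read the raw HTML source from disk
      Strl.LoadFromFile(sHTMLFile);
      IDoc := CreateComObject(Class_HTMLDocument) as IHTMLDocument2;
      try
        IDoc.designMode := 'on';
        while IDoc.readyState <> 'complete' do
          Application.ProcessMessages;
        // IHTMLDocument2.write expects a SAFEARRAY of variants
        v := VarArrayCreate([0, 0], varVariant);
        v[0] := Strl.Text;
        IDoc.write(PSafeArray(System.TVarData(v).VArray));
        IDoc.designMode := 'off';
        while IDoc.readyState <> 'complete' do
          Application.ProcessMessages;
        // Show the rendered text, with all tags stripped, in the memo
        Memo1.Lines.Text := IDoc.body.innerText;
      finally
        IDoc := nil;
      end;
    finally
      Strl.Free;
    end;
  end;
end;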
kot-da-vinci
denn
  • What have you tried so far? If you are having a specific problem with a specific piece of code, ask about that. If you are just asking how to do something in general, but have not tried to solve the problem yourself yet, that is generally not the kind of question that belongs here. – Remy Lebeau Apr 01 '15 at 21:20

2 Answers

Use TXMLParser from http://destructor.de to parse the HTML document and extract the information you need as part of the scanning process.

Steve F

I have done this successfully many times simply by loading the html into a string and using Pos to find the text that I am looking for.

For example, to get the Dil value, call the function with a known "target" that appears just before the text you are looking for, e.g. "Dil:".

<td valign="bottom">Dil:
        </td><td valign="bottom">108536B&nbsp;
                      </td>

Call this function with 'Dil:' as the ATag parameter and the document text as the HTML parameter.

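function GetTagValue(const HTML, ATag: string): string;
var
  P: Integer;
  S: string;
begin
  Result := '';
  // Cut everything before the label, e.g. 'Dil:'
  P := Pos(ATag, HTML);
  if P = 0 then Exit;
  S := Copy(HTML, P, MaxInt);
  // Cut up to the start of the value cell that follows the label
  P := Pos('bottom">', S);
  if P = 0 then Exit;
  S := Copy(S, P + 8, MaxInt);
  // The value runs up to the next '&nbsp'
  P := Pos('&nbsp', S);
  if P = 0 then Exit;
  Result := Trim(Copy(S, 1, P - 1));
end;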

I just typed this as an example - you may want to test it first!!!
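
Since the report contains one table per part, the same Pos-based idea can be repeated over the whole document to collect every Dil value and every BMP path. What follows is only a sketch under stated assumptions: ExtractAll and FileUrlToUnc are hypothetical helper names (not part of any library), PosEx comes from the StrUtils unit, the HTML file name is passed as the first command-line argument, and every Dil value is assumed to be followed by '&nbsp' as in the sample above.

```pascal
program ExtractDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, StrUtils;

// Hypothetical helper: for every occurrence of ALabel, collects the text
// between the next StartMark and the EndMark that follows it.
procedure ExtractAll(const Text, ALabel, StartMark, EndMark: string;
  Found: TStrings);
var
  P, Q: Integer;
begin
  P := PosEx(ALabel, Text, 1);
  while P > 0 do
  begin
    P := PosEx(StartMark, Text, P + Length(ALabel));
    if P = 0 then
      Break;
    Inc(P, Length(StartMark));
    Q := PosEx(EndMark, Text, P);
    if Q = 0 then
      Break;
    Found.Add(Trim(Copy(Text, P, Q - P)));
    // Continue searching after the value that was just collected
    P := PosEx(ALabel, Text, Q + Length(EndMark));
  end;
end;

// Hypothetical helper: turns 'file://///SERVER/share/x.bmp' into
// '\\SERVER\share\x.bmp'. Assumes the URL points at a network share,
// not at a local drive.
function FileUrlToUnc(const Url: string): string;
var
  I: Integer;
begin
  Result := Url;
  if SameText(Copy(Result, 1, 5), 'file:') then
  begin
    // Skip all slashes that follow 'file:'
    I := 6;
    while (I <= Length(Result)) and (Result[I] = '/') do
      Inc(I);
    Result := '\\' + Copy(Result, I, MaxInt);
  end;
  Result := StringReplace(Result, '/', '\', [rfReplaceAll]);
end;

var
  Html, Parts, Bitmaps: TStringList;
  I: Integer;
begin
  Html := TStringList.Create;
  Parts := TStringList.Create;
  Bitmaps := TStringList.Create;
  try
    // Assumption: the HTML file name is the first command-line argument
    Html.LoadFromFile(ParamStr(1));
    ExtractAll(Html.Text, 'Dil:', 'bottom">', '&nbsp', Parts);
    ExtractAll(Html.Text, '<img', 'src="', '"', Bitmaps);
    for I := 0 to Parts.Count - 1 do
      Writeln(Parts[I]);
    for I := 0 to Bitmaps.Count - 1 do
      Writeln(FileUrlToUnc(Bitmaps[I]));
  finally
    Bitmaps.Free;
    Parts.Free;
    Html.Free;
  end;
end.
```

For the sample in the question this should list the Dil values 108536B and NOID_2, followed by the two BMP paths in UNC form. In a VCL form, a converted path could then be shown with something like Image1.Picture.LoadFromFile(FileUrlToUnc(Bitmaps[0])), where Image1 is assumed to be a TImage on the form.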

penarthur66
  • http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – David Heffernan Apr 03 '15 at 20:42
  • Hello, thank you, but I don't really know how to use this function. procedure TForm1.Button2Click(Sender: TObject); begin memo2.Lines.Text := GetTagValue('F:\IFS\Ciarove_kody\Tlac_job\JOBM3046.HTML','Dil'); end; Is this correct? Many thanks – denn Apr 08 '15 at 20:07
  • More like: procedure TForm1.Button2Click(Sender: TObject); var S: String; begin memo2.Lines.LoadFromFile('F:\IFS\Ciarove_kody\Tlac_job\JOBM3046.HTML'); S := GetTagValue(Memo2.Text,'Dil'); end; – penarthur66 Apr 09 '15 at 08:36