
I have some very old source code (from 2008) that uses a dump of PCRE. It used to work, but with Delphi 10.3 it no longer does. I would like to do the same thing with System.RegularExpressions, but I don't know how.

I want to parse an HTML file to extract a table, then loop over its rows to find one particular row and select a particular column in that row.

The old code is:

table := RegexMatchedExpression(page, '<table.*?>.*?</table>', 3);

  rows := TStringList.create;
  RegexAllMatchedSubExpression(rows, table, '<tr.*?>(.*?)</tr>');
  for r:= 0 to rows.count-1 do begin 
    cols := TStringlist.create;
    RegexAllMatchedSubExpression(cols, rows[r], '<td.*?>(.*?)</td>');

    for c := 1 to cols.count-1 do begin  
      if r=0 then begin
        Cequejeveux[c-1] := cols[c];

Perhaps it would be 'lighter' to find the row directly. It starts with <tr class="odd">.

Can you help me, please? I can't find a System.RegularExpressions tutorial.

rooky06
  • In general [parsing HTML using RegEx is a bad idea](https://stackoverflow.com/a/1732454/511529), except maybe in specific cases where the structure of the document is well known. For this question specifically, it's hard to help you fix your code, since your question is lacking an example of that HTML and a concrete description of the behavior you get (like compiler errors, runtime errors or just an incorrect match). Reading again, it doesn't even contain the code that doesn't work. – GolezTrol Oct 17 '19 at 11:46
  • Btw, maybe not a tutorial in the strict sense, but there is plenty of reference material, including [this explanation of System.RegularExpressions, written by the author of the pcre library](https://www.regular-expressions.info/delphi.html). – GolezTrol Oct 17 '19 at 11:50
  • I wonder what you mean by `.*?`, though. `.*` already matches 0 or more of any character. What is the `?` doing after that? – GolezTrol Oct 17 '19 at 11:51
  • I did not explain myself well, sorry. My English is not good, so I make do with the words I know, but here I am using Google to explain correctly. My problem is that the example shown worked perfectly, but since Delphi XE the unit I used, a dump of PCRE, no longer works, so I want to use the unit that ships with Delphi. My goal is to parse an HTML table, extract the rows, and extract the columns from each row. As for the `?`, all the examples I find use it. Thanks for the link, I will read it carefully. – rooky06 Oct 17 '19 at 12:43
  • @GolezTrol The `?` in `.*?` makes the match non-greedy, so it will stop as soon as it consumes enough characters to match the regex. See https://stackoverflow.com/a/3075150/212016 – Keith Miller Oct 17 '19 at 21:19
  • I hope the link helps you, @rooky06. If so, please post your solution as an answer for future visitors. Otherwise, make sure to at least update the question: add the new code, the one that's not working, and a description of _how_ it is not working (compilation, specific runtime error, specific incorrect behavior). Without that, this question is incomplete and therefore off-topic. – GolezTrol Oct 18 '19 at 06:42
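
For future visitors, here is a minimal sketch of how the table/row/column loop from the question might map onto `System.RegularExpressions`. It is not the asker's final code: the sample `Page` string is hypothetical (the question includes no HTML), the program and variable names are made up for illustration, and the patterns are the non-greedy ones from the question, which only hold up when the table structure is simple and known in advance. `roSingleLine` is used so that `.` also matches line breaks inside a table.

```delphi
program ExtractTableCells;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

const
  // Hypothetical sample input; the real page is not shown in the question.
  Page = '<table><tr class="odd"><td>A</td><td>B</td></tr>' +
         '<tr class="even"><td>C</td><td>D</td></tr></table>';
  // roSingleLine lets "." match newlines; roIgnoreCase accepts <TD>, <Td>, ...
  Opts: TRegExOptions = [roIgnoreCase, roSingleLine];

var
  TableMatch, RowMatch, ColMatch, OddRow: TMatch;
begin
  // 1. Isolate the first table (non-greedy, so it stops at the first </table>).
  TableMatch := TRegEx.Match(Page, '<table.*?>.*?</table>', Opts);
  if not TableMatch.Success then
    Exit;

  // 2. Walk every row, then every cell of that row.
  //    Groups[1] is the text captured by the parenthesised (.*?).
  for RowMatch in TRegEx.Matches(TableMatch.Value, '<tr.*?>(.*?)</tr>', Opts) do
    for ColMatch in TRegEx.Matches(RowMatch.Groups[1].Value, '<td.*?>(.*?)</td>', Opts) do
      Writeln(ColMatch.Groups[1].Value);

  // 3. Alternative: jump straight to the row that starts with <tr class="odd">.
  OddRow := TRegEx.Match(Page, '<tr\s+class="odd"[^>]*>(.*?)</tr>', Opts);
  if OddRow.Success then
    for ColMatch in TRegEx.Matches(OddRow.Groups[1].Value, '<td.*?>(.*?)</td>', Opts) do
      Writeln('odd row cell: ', ColMatch.Groups[1].Value);
end.
```

Unlike the old helpers, which filled a `TStringList`, `TRegEx.Matches` returns a collection that can be enumerated directly, so no intermediate lists need to be created or freed. If a specific column is wanted, the collection can also be indexed instead of looped over.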

0 Answers