
Some Google-fu led me to this answer; however, after mucking around with it and reading the docs, I can't figure out how to actually build a QWebFrame in order to parse the response.

I will need to do something a fair bit more elaborate than this later, but right now all I'm trying to do is POST some data, a login username and password, to a website and parse the title tag on the response page to determine whether the login was a success or a failure. I feel like it might be quicker to do that with a regex rather than building a whole DOM, but I don't know regex and this seems easier at the moment.

What I have right now: I post the data, and when the request emits the finished() signal the reply is handed to a method of a subclassed QDialog. So I've got a QNetworkReply that I'm trying to parse, and I don't know where to go from there. If you need to see my code please ask, but I figured it was unnecessary. Thanks guys.

kryptobs2000

1 Answer


Don't parse HTML with regex!
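
As a rough illustration of the parser-based alternative, here is a minimal sketch that pulls the title out of a response body with a real HTML tokenizer. It uses Python's standard `html.parser` module rather than Qt, and it assumes the body has already been decoded to a string. The names `TitleParser` and `extract_title` are made up for this example.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None  # stays None if the page has no <title>

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; only the first <title> is recorded.
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_data(self, data):
        if self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def extract_title(html):
    """Return the stripped title text, or None when no <title> was found."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip() if parser.title is not None else None


page = (
    "<html><head><!-- <title>test</title> -->"
    "<title>Login OK</title></head><body></body></html>"
)
result = extract_title(page)
```

The commented-out title is handed to the comment handler and never reaches `handle_starttag`, so only the real title is collected. Comparing `result` against the expected success title would then decide whether the login worked.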

karlphillip
  • I know you normally shouldn't. However, my thought is that the best way to do this would be to start reading the data before it has finished downloading: all I need is the title, which could be matched with a one-pass regex, and once it's extracted the request could just be aborted. That has to be much better than downloading the whole page, parsing it into a DOM tree and then extracting the title text. Still, the performance difference is hardly noticeable, and I don't know regex, so I'm going with the latter route anyway, especially since I _will_ need it later on. – kryptobs2000 Feb 21 '11 at 18:20
  • @kryptobs2000 if you search for <title>.*</title> that could work. But imagine if someone put something like <title>test</title> in a comment: you're screwed. It could also be in a script. So it'll work, until it won't. – anno Feb 22 '11 at 00:05
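
The two comments above can be reconciled: an incremental parser can read the response as it arrives and stop early, without falling for titles hidden in comments or scripts. The sketch below is only an illustration in Python's standard library, not Qt code. The chunk list stands in for data arriving from the network, and `StreamingTitleParser` and `title_from_chunks` are hypothetical names.

```python
import re
from html.parser import HTMLParser


class StreamingTitleParser(HTMLParser):
    """Records the first real <title>, even when it is split across chunks."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._parts = []
        self.title = None  # set once the closing </title> has been seen

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.title = "".join(self._parts).strip()


def title_from_chunks(chunks):
    """Feed chunks one at a time; stop as soon as the title is complete."""
    parser = StreamingTitleParser()
    consumed = 0
    for chunk in chunks:
        consumed += 1
        parser.feed(chunk)
        if parser.title is not None:
            break  # a real client could abort the request at this point
    return parser.title, consumed


# Simulated response: a commented-out title, a title-like string inside a
# script, and the real title split across two chunks.
chunks = [
    "<html><head><!-- <title>te",
    "st</title> --><script>var t = '<title>fake</title>';</script>",
    "<title>Welcome ba",
    "ck</title></head>",
    "<body>...rest of the page...</body></html>",
]

# The naive regex grabs the first textual match, which is inside the comment.
naive = re.search(r"<title>(.*?)</title>", "".join(chunks)).group(1)
title, consumed = title_from_chunks(chunks)
```

Here `naive` ends up holding the commented-out title, while `title` holds the real one. The loop stops before the body chunk is read, which is the early abort described in the first comment.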