2

As the title says, is there a quick way to do that? I don't need a robust solution, just anything that can differentiate, for example:

http://asdasd/

is not a valid domain name, whereas

http://asd.asdasd.asd

is a valid domain name.

I searched for a solution; the closest (simple) one I found is this, in Python.

But that's for Python; I need to do it in C++. Any help?

Can it be done using "string manipulation" only? Like substrings?

akow
  • Regular expressions? As Raymond Chen would say, "now you have two problems" – n8wrl Aug 17 '11 at 18:43
  • Hmm... anything other than regular expressions? I saw a book at the library, very thick, just about regular expressions. I will study them in the future, but not now; I only began C++ two days ago – akow Aug 17 '11 at 18:48
  • Well, if you want to really be correct, look up the RFC and implement a parser for the grammar mentioned there – PlasmaHH Aug 17 '11 at 18:48
  • Lexer (parser) and state machine? – Thomas Matthews Aug 17 '11 at 18:49
  • Quibble: `http://` is not part of the domain name; the domain name is `"asd.asdasd.asd"`. Bigger quibble: `"asdasd"` *is* a valid domain name. If you're on `foo.example.com`, and there's a host named `asdasd.example.com`, depending on the configuration, you can probably refer to it as just `"asdasd"`. To be clear, you consider `asd.asdasd` to be "valid" even though there's no actual `asd` TLD (top level domain), right? – Keith Thompson Aug 17 '11 at 18:50
  • @akow: The thickness of a book is not a reliable indicator of how difficult its topic is. You can write arbitrarily thick books about _anything_, from how to use a pencil sharpener to how to mow a lawn. The only difference is that thick books about regexes sell well enough to get published, and _that_ is purely a measure of how _afraid_ the book-buying public are of regexes. Which is completely without reason, as the basic concepts can equally well be covered in 4-5 pages of lecture notes. – hmakholm left over Monica Aug 17 '11 at 18:59
  • You're trying to validate *domain* names or *host* names? – Ben Voigt Aug 17 '11 at 19:12
  • @Keith: I wish I could downvote comments: In `http://asd.asdasd.asd/`, the part `asd.asdasd.asd` is NOT a domain name, it's a hostname. Hostnames can be interpreted relatively, as in your `asdasd.example.com` example, domain names can't. And with the introduction of arbitrary TLDs, `http://asdasd/` could refer to a fully-qualified name rather than a relative one. – Ben Voigt Aug 17 '11 at 19:18
  • @Ben: Thanks for the info. (I suspect the OP really wants to know about hostnames, and is incorrectly assuming that `http://asdasd` is an invalid URL, but that's a guess on my part.) My comment was based on what I thought the OP was actually asking about; I was sloppy with the terminology. – Keith Thompson Aug 17 '11 at 19:33
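
To give a feel for the "look up the RFC" suggestion in the comments above, here is a minimal sketch of a syntax-only hostname check using nothing but string manipulation. It assumes the commonly cited RFC 1123 rules (labels of 1 to 63 characters made of ASCII letters, digits and hyphens, no leading or trailing hyphen, at most 253 characters overall). The function name `is_valid_hostname_syntax` is made up for illustration, and the check says nothing about whether the host actually exists.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: checks hostname *syntax* only (assumed RFC 1123 rules).
// It does not tell you whether the name resolves to anything.
bool is_valid_hostname_syntax(const std::string& host) {
    // Overall length limit for a name written in dotted form.
    if (host.empty() || host.size() > 253) return false;

    std::size_t start = 0;
    while (true) {
        std::size_t dot = host.find('.', start);
        std::size_t end = (dot == std::string::npos) ? host.size() : dot;
        std::size_t len = end - start;

        // Each label must be 1..63 characters long.
        if (len == 0 || len > 63) return false;
        // A label may not begin or end with a hyphen.
        if (host[start] == '-' || host[end - 1] == '-') return false;
        // Only letters, digits and hyphens are allowed (default "C" locale assumed).
        for (std::size_t i = start; i < end; ++i) {
            unsigned char c = static_cast<unsigned char>(host[i]);
            if (!std::isalnum(c) && c != '-') return false;
        }

        if (dot == std::string::npos) break;
        start = dot + 1;
    }
    return true;
}
```

Note that this sketch accepts a single label such as `asdasd`, in line with the comments above, and rejects a trailing dot because the last label would be empty.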

2 Answers

0

I believe this can be done with libcurl.

genpfault
ks1322
0

Barring the fact that http://... is not a domain name but a URL, and that asdasd is a valid domain name if set up as a search domain (such as on a local network), purely checking the string syntax can be done with a simple set of strncmp, strchr and strstr calls:

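#include <cstring>

const char *str = "http://abd.xxx";

// strncmp returns 0 on a match, so compare against 0 explicitly.
bool valid = strncmp(str, "http://", 7) == 0 && str[7] != '\0' && strchr(str + 7, '.') != NULL;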

This checks that the string starts with http://, that there is more text after the http://, and that this remainder contains a dot. If you also want to handle URLs that contain an actual path, like http://expample.com/mypath.txt, then the example becomes more complex, but you didn't specify whether that was needed.
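
For the case with a path, one possible sketch is to cut the host part out first and only look for the dot there, so that a dot in the path (as in mypath.txt) does not count. This uses std::string instead of the C string functions; the function name `looks_like_dotted_http_url` is made up for illustration, and ports, user info and other schemes are deliberately not handled.

```cpp
#include <string>

// Hypothetical helper: true if `url` starts with "http://" and the host part
// (everything up to the first '/', if any) contains a dot that is neither
// its first nor its last character.
bool looks_like_dotted_http_url(const std::string& url) {
    const std::string scheme = "http://";
    if (url.compare(0, scheme.size(), scheme) != 0) return false;

    // The host ends at the first '/' after the scheme, or at the end of the string.
    std::size_t host_end = url.find('/', scheme.size());
    std::size_t host_len = (host_end == std::string::npos)
                               ? std::string::npos
                               : host_end - scheme.size();
    std::string host = url.substr(scheme.size(), host_len);

    // Look for the first dot inside the host only.
    std::size_t dot = host.find('.');
    return dot != std::string::npos && dot > 0 && dot + 1 < host.size();
}
```

With this, http://asdasd/file.txt is rejected because the only dot is in the path, while http://asd.asdasd.asd is accepted.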

Alternatively, you can use a regex with the pattern from the Python answer you pointed to yourself.
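
A rough sketch of the regex route, assuming a C++11 compiler with a working `<regex>`. The pattern below is my own guess at a reasonable one, not the pattern from the Python answer, and the function name `matches_dotted_http_url` is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: "http://", then a label, then one or more ".label"
// groups, then an optional path. The pattern is an assumption for this sketch.
bool matches_dotted_http_url(const std::string& url) {
    static const std::regex pattern(
        R"(http://[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+(/.*)?)");
    // regex_match requires the whole string to match the pattern.
    return std::regex_match(url, pattern);
}
```

Like the string version, this only checks the shape of the text; it accepts http://asd.asdasd.asd and rejects http://asdasd/.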

Soren
  • It could be a valid local hostname, but it's also a potentially valid FQDN since the introduction of arbitrary TLDs. – Ben Voigt Aug 17 '11 at 21:12
  • No disagreement here -- the only way to really find out if a name is valid is by doing a host lookup -- but that is NOT what the OP is asking for -- the question is just about matching the pattern. – Soren Aug 17 '11 at 21:45
  • I think that is the correct answer to the question: "Can it be done using string manipulation?" "No, a host lookup is required." – Ben Voigt Aug 17 '11 at 21:48
  • @Ben -- then you should put that in as your answer -- I may just vote for that :-) – Soren Aug 18 '11 at 03:17
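
For completeness, the host lookup mentioned in these comments could be sketched roughly like this on a POSIX system, using `getaddrinfo`. The function name `host_resolves` is made up for illustration. Keep in mind that it takes a bare host name (no http:// prefix), and that a failed lookup can also mean a network or resolver problem rather than an invalid name.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper (POSIX only): true if the system resolver returns
// at least one address for `host`.
bool host_resolves(const char* host) {
    addrinfo hints = {};
    hints.ai_family = AF_UNSPEC;      // accept IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // avoid one result per socket type

    addrinfo* results = nullptr;
    int rc = getaddrinfo(host, nullptr, &hints, &results);
    if (rc != 0) return false;        // gai_strerror(rc) describes the failure

    freeaddrinfo(results);            // release the list allocated by getaddrinfo
    return true;
}
```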