0

C++ Sockets. Getting information from a website.

I am trying to read content from the web using sockets, with the following code.

int status = getaddrinfo(l_url.c_str(), "http", &l_address, &l_addr_ll);
if (status != 0 ){
    printf("\n ***** getaddrinfo() failed: %s\n", gai_strerror(status));

    return FAILURE;
}

The code works fine for URLs like "www.yahoo.com" and "www.google.com"; however, it doesn't work for URLs like "www.google.com/nexus".

Any URL containing a "/" fails with this code. Am I missing anything?

kris123456

3 Answers

3

getaddrinfo gives you information about network addresses, not about URLs. A URL is not a network address, though it often contains one. A string like "www.google.com/nexus" is neither a URL nor an address (though it might well be part of a URL), so it's not surprising that getaddrinfo fails for it.
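As a minimal sketch of the point above, the lookup can be limited to the bare host name. The helper name `resolve_host` and its default port argument are hypothetical; the numeric port "80" is assumed here as a stand-in for the "http" service name so the lookup does not depend on the local services database.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <cstring>
#include <string>

// Hypothetical helper: returns the getaddrinfo() status for a bare host name.
// 0 means the lookup succeeded; any other value can be passed to gai_strerror().
int resolve_host(const std::string& host, const std::string& service = "80") {
    addrinfo hints;
    std::memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      // accept IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP, which HTTP runs over

    addrinfo* results = nullptr;
    int status = getaddrinfo(host.c_str(), service.c_str(), &hints, &results);
    if (status == 0) {
        freeaddrinfo(results);  // this sketch only reports the status
    }
    return status;
}
```

Passing "www.google.com" to such a helper is a host lookup; passing "www.google.com/nexus" is not, because the "/nexus" part belongs to the later HTTP request rather than to the address.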

Chris Dodd
  • What a prompt response, Chris! Kudos. – kris123456 Oct 14 '13 at 05:46
  • Any advice on how this could be resolved, Chris? I need data from certain websites. As mentioned, I would be accessing URLs like "google.com/nexus", "apple.com/imac", etc. What changes need to be made in my code? – kris123456 Oct 14 '13 at 05:48
  • @kris123456: easiest solution would be to use a URL library, such as [libcurl](http://curl.haxx.se/libcurl/), rather than getaddrinfo – Chris Dodd Mar 24 '15 at 15:52
1

The man page says that the first parameter is supposed to be a host name. The host name is only the part up to and including the top-level domain, for example "www.google.com". Everything after that (the path) does not belong to the host name. Take care: some parts before it may not belong to the host name either, especially if you see an @ in your URL.

Have a look at the Wikipedia article on URLs; it explains at length which part of a URL is actually the host name you can pass to your function.
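To illustrate which part is the host name, here is a rough sketch that splits a URL-like string into host and path. The names `UrlParts` and `split_url` are made up for this example, and it is not a full URL parser: it assumes an optional "scheme://" prefix, optional "user@" info and an optional ":port", and it does not handle IPv6 literals or a query string without a path.

```cpp
#include <string>

struct UrlParts {
    std::string host;  // the part getaddrinfo() expects
    std::string path;  // the part that goes into the HTTP request
};

// Hypothetical helper; a simplification, not an RFC 3986 parser.
UrlParts split_url(const std::string& url) {
    std::string rest = url;

    // Drop an optional scheme such as "http://" (only if it precedes any '/').
    const std::string::size_type scheme_end = rest.find("://");
    if (scheme_end != std::string::npos && scheme_end < rest.find('/')) {
        rest = rest.substr(scheme_end + 3);
    }

    // Everything from the first '/' onwards is the path.
    UrlParts parts;
    const std::string::size_type slash = rest.find('/');
    if (slash == std::string::npos) {
        parts.host = rest;
        parts.path = "/";
    } else {
        parts.host = rest.substr(0, slash);
        parts.path = rest.substr(slash);
    }

    // Drop "user:password@" in front of the host, if present.
    const std::string::size_type at = parts.host.rfind('@');
    if (at != std::string::npos) {
        parts.host = parts.host.substr(at + 1);
    }

    // Drop an explicit ":port" suffix.
    const std::string::size_type colon = parts.host.find(':');
    if (colon != std::string::npos) {
        parts.host = parts.host.substr(0, colon);
    }
    return parts;
}
```

With this sketch, `split_url("www.google.com/nexus").host` is "www.google.com", which is the string to hand to getaddrinfo. For anything beyond simple cases, a real URL library is the safer choice.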

nvoigt
  • There are different scenarios and solutions [here](http://stackoverflow.com/questions/2616011/easy-way-to-parse-a-url-in-c-cross-platform). – nvoigt Oct 14 '13 at 07:48
0

As per the man page, one needs to pass the host name, not the whole URL, to getaddrinfo. For this, the user must pass only the name of the website, like "www.google.com". When requesting data, however, the user sends a request, and at that point the request can refer to a URL like "www.google.com/nexus".

  • The address is the same for every URL on that site; only the request varies. Hence one needs to look up the address using only the part up to ".com". Once the address info is received, further requests can be made accordingly.
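The "request varies" part could look roughly like the sketch below, which builds a plain HTTP/1.1 GET request in which the path ("/nexus") travels separately from the host. The function name `build_get_request` is hypothetical, and sending "Connection: close" is an assumption made to keep reading the response simple.

```cpp
#include <string>

// Hypothetical helper: builds a minimal HTTP/1.1 GET request.
// The host goes into the Host header; the path goes into the request line.
std::string build_get_request(const std::string& host, const std::string& path) {
    std::string request;
    request += "GET " + (path.empty() ? std::string("/") : path) + " HTTP/1.1\r\n";
    request += "Host: " + host + "\r\n";
    request += "Connection: close\r\n";  // ask the server to close when done
    request += "\r\n";                   // blank line ends the headers
    return request;
}
```

The resulting string would be written with send() to a socket connected to one of the addresses that getaddrinfo returned for "www.google.com".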
kris123456