
I'm building a simple C program which takes a user input parameter (a URL) using scanf(), as the code below reflects. I'm now looking for the best "standard" way to read/write the remote file to a local file... which I will then perform a grep (search) operation on.

//CODE

#include <stdio.h>

int main(void){
   char url[255];

   //USER INPUT URL
   printf("ENTER URL: ");
   scanf("%254s", url);  /* width limit prevents overflow; url already decays to char* */

   //GET FILE AT URL(REMOTE) AND COPY TO (LOCAL) 




   //RETURN
   return 0;
}
Jordan Davis

3 Answers


You could use libcurl, or just shell out and use wget or curl, honestly.

char command[1024];
snprintf(command, sizeof command, "wget -c '%s'", url);
if (system(command) != 0) { /* handle the error */ }

That will take considerably less effort.

sehe
  • Yea I took a look at it, works fine.. But there has to be a standard way, without loading another 3rd party lib in – Jordan Davis Sep 10 '15 at 22:18
  • I don't think so. What makes you think that? Reading files from webservers is not - by any measure - a trivial task. Just that people do it all the time doesn't make it trivial. Highlevel programming languages exist for a reason. – sehe Sep 10 '15 at 22:19
  • Ahh goodshttt! Yea Ima just shell out the cmd, eating lunch right now I'll mark it correct when I get back and test it. Thank you, thank you! – Jordan Davis Sep 10 '15 at 22:25
  • Improve the world: think of security ramifications whenever you use external input in the commands issued :) Cheers – sehe Sep 10 '15 at 22:25
  • @JordanDavis - or, rather than write that program, just run wget or curl yourself. Or do it as a shell script, to save you from having to type the `-c` part. Or, if this is for a learning experience, I would use libcurl, as sehe suggests, to learn how to use that lib. Or use low level sockets yourself to learn them, but unless learning sockets is the goal it's the hardest way to get your remote file. – Stephen P Sep 10 '15 at 22:43
  • @StephenP [the `-c` part](https://www.gnu.org/software/wget/manual/wget.html#Download-Options) was a bit random (just for demonstration). It's not in any way related to running it as a child process – sehe Sep 10 '15 at 22:44
  • @StephenP yea I'm not looking to just run the cmd manually... Im entering a url and having it parsing the source code from that file pulling out the URLs for an MP4 video at different formats. Some 480p 720p 1080p etc... – Jordan Davis Sep 10 '15 at 22:50
  • @JordanDavis sounds very very much like a perl job (perhaps python/ruby/whatnot) – sehe Sep 10 '15 at 22:55
  • @sehe yea no I want it in C for speed... I know php, python, and ruby.... I'm trying to move away from interpretive languages though and their curtain of lies lol (they are good for development though). I'm not worried about security since it's a private program for my personal use for now. – Jordan Davis Sep 11 '15 at 13:30
  • I'm sorry to inform you but "I want to download a file in C for speed" is ... ridiculous. You need to measure your bottlenecks before optimizing. – sehe Sep 11 '15 at 13:36

You could fork() and then execute a command like wget or curl in the child with one of the exec family (execl, execlp, etc.), and afterwards read the downloaded file line by line with fgets and write out lines with fputs.


I think you are looking for some kind of HTML parser; one possibility is to include the curl lib:

#include <curl/curl.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    CURL *myHandle;
    CURLcode result;
    FILE *file;

    curl_global_init(CURL_GLOBAL_ALL);

    if ((file = fopen("webpage.html", "wb")) == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }

    if ((myHandle = curl_easy_init()) == NULL) {
        fprintf(stderr, "curl_easy_init failed\n");
        exit(EXIT_FAILURE);
    }

    curl_easy_setopt(myHandle, CURLOPT_URL, "http://cboard.cprogramming.com/");
    /* With no CURLOPT_WRITEFUNCTION set, libcurl fwrite()s the body
       to the FILE* passed as CURLOPT_WRITEDATA. */
    curl_easy_setopt(myHandle, CURLOPT_WRITEDATA, file);

    /* curl errors don't set errno, so report them with curl_easy_strerror
       rather than perror. */
    if ((result = curl_easy_perform(myHandle)) != CURLE_OK) {
        fprintf(stderr, "curl: %s\n", curl_easy_strerror(result));
        exit(EXIT_FAILURE);
    }

    curl_easy_cleanup(myHandle);
    curl_global_cleanup();
    fclose(file);
    puts("Webpage downloaded successfully to webpage.html");

    return 0;
}

It's from post #10 in a forum thread on cboard.cprogramming.com (the original link here is broken).

Cydyn
  • Yea I know all about libcurl, I was just looking for a good "standard" way of doing it if there was one. I think just executing the "curl" command using `execl` is the only way... – Jordan Davis Sep 11 '15 at 14:10