I have more than 100 million domains in my MySQL database and I want to check whether each one is active. I did this with PHP cURL: the script requests the domain, and if a response comes back I mark the domain as active; if no response comes back, I mark it as inactive.

Here is the code:

function isDomainAvailable(){
    $domain = "http://www.".$this->input->post('url');
    $id = $this->input->post('id');

    $this->load->model('urls');
    if(!filter_var($domain, FILTER_VALIDATE_URL)){
        $this->urls->setActive($id,0);
        errorMessage('Invalid url');
        return; // stop here in case errorMessage() does not exit on its own
    }

    // Request the headers only (no body) and give up connecting after 10 seconds
    $curlInit = curl_init($domain);
    curl_setopt($curlInit,CURLOPT_CONNECTTIMEOUT,10);
    curl_setopt($curlInit,CURLOPT_HEADER,true);
    curl_setopt($curlInit,CURLOPT_NOBODY,true);
    curl_setopt($curlInit,CURLOPT_RETURNTRANSFER,true);

    $response = curl_exec($curlInit);
    curl_close($curlInit);
    if ($response){
        $this->urls->setActive($id,1);
        successMessage('url is active');
        return; // without this, the domain would be marked inactive again below
    }
    $this->urls->setActive($id,0);
    errorMessage('url is not active');
}

I am running this script through synchronous AJAX calls so that the UI keeps updating, which shows that the script is still running and which record it is currently on.

Here is the jQuery code:

$(document).ready(function(){
$("#startCheck").click(function(){
    var pages = 2313744;
    for(var i=1;i<=pages;i++){
        var url = "<?php echo base_url() ?>main/getDomains/"+i+"/1";

        $.ajax({
            type: "POST",
            url:url,
            data: $("#form").serialize(),
            dataType: 'json',
            timeout:30000,
            async: false,
            success: function (data) {
                if(data.data){
                    $(".recordstable").find("tr:gt(0)").remove();
                    $.each(data.data, function(i,v) {
                        var row = '<tr>';
                        row += '<td>'+v.id+'</td>';
                        row += '<td>'+v.url+'</td>';
                        row += '<td class="active'+v.id+'">';
                        row += '<img src="<?php echo base_url(); ?>images/loading.gif" width="16" height="16" alt="Loading"></td>';
                        row += '</tr>';
                        $('.recordstable tr:last').after(row);

                        $.ajax({
                            type: "POST",
                            url:'<?php echo base_url() ?>main/isDomainAvailable',
                            data: {url:v.url,id:v.id},
                            dataType: 'json',
                            timeout:10000,
                            async: false,
                            success: function (data) {
                                if(data.success){
                                    $('.active'+v.id).html('<span class="label label-success">Yes</span>');
                                }else{
                                    $('.active'+v.id).html('<span class="label label-danger">No</span>');
                                }
                            },
                            error:function(data){
                                $('.active'+v.id).html('<span class="label label-warning">Failed</span>');
                                //alert("something went wrong, please try again.");
                                $(".loader").hide();
                            }
                        });
                    });
                }
            },
            error:function(data){
                alert("something went wrong, please try again.");
                $(".loader").hide();
            }
        });
        //break;
    }
    return false;
});

});

First the script runs a for loop over the page URLs; each request returns 50 domains. When the data comes back, a second AJAX call sends each domain to the PHP script posted above, which tests the domain and returns success or failure. The script then updates the UI to show that the check is done. Here is a screenshot of the web page testing the domains:

[Screenshot]

The problem is that this script runs very slowly: it tested only 4,000 domains in the last 10 hours. I need a much faster way to test the domains. PHP is not required for this; if there is a solution available in Python, please share it.
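To put that rate in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the numbers quoted above (the loop bound of 2313744 pages, 50 domains per page, 4,000 domains in 10 hours) and assumes that every page is full and that the rate stays constant, so treat it as an estimate rather than a measurement.

```python
# Figures taken from the question; "every page is full" and
# "the rate stays constant" are assumptions.
PAGES = 2313744          # loop bound in the jQuery code
DOMAINS_PER_PAGE = 50    # each getDomains request returns 50 domains
TESTED = 4000            # domains tested so far
ELAPSED_HOURS = 10       # time that took

total_domains = PAGES * DOMAINS_PER_PAGE
seconds_per_domain = ELAPSED_HOURS * 3600 / TESTED
domains_per_hour = TESTED / ELAPSED_HOURS
hours_needed = total_domains / domains_per_hour

print("total domains:      %d" % total_domains)
print("seconds per domain: %.1f" % seconds_per_domain)
print("hours needed:       %.0f" % hours_needed)
print("years needed:       %.1f" % (hours_needed / 24 / 365))
```

That works out to about 9 seconds per domain and roughly 33 years for the whole table. Nine seconds is close to the 10-second timeouts in the code, so it seems likely that much of the time goes to waiting on unresponsive domains one request at a time.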

Wasif Khalil

1 Answer

Why JavaScript and AJAX? Do it all in PHP.

Instead of testing $response, use curl_errno() (call it before curl_close()):

if (curl_errno($curlInit)){
  errorMessage(curl_error($curlInit));
}
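Since the question also asks about Python, the same idea (separating "the request itself failed" from "the server answered") might look roughly like the sketch below. It uses only the standard library; the function name `check_domain` and the `opener` parameter are made up for illustration, and the "http://www." prefix and 10-second timeout are copied from the PHP in the question. Any HTTP status, including 404, is treated as "active" here, which mirrors the original logic of counting any response.

```python
import http.client
import urllib.error
import urllib.request


def check_domain(domain, timeout=10, opener=urllib.request.urlopen):
    """Return (active, detail) for one domain.

    `opener` is a hypothetical hook so the function can be tested without
    network access; by default it is urllib.request.urlopen.
    """
    url = "http://www." + domain
    # HEAD asks for headers only, like CURLOPT_NOBODY in the PHP version.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return True, "HTTP %d" % response.status
    except urllib.error.HTTPError as exc:
        # The server answered with an error status, so the domain is reachable.
        return True, "HTTP %d" % exc.code
    except (OSError, http.client.HTTPException, ValueError) as exc:
        # DNS failure, refused connection, timeout, malformed reply or bad URL.
        return False, str(exc) or exc.__class__.__name__
```

The second element of the tuple plays the role of curl_error(): it says why a domain was marked inactive, which is useful when deciding whether to retry it later.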
Misunderstood
  • I used JavaScript just to keep the UI updated, to show that the test is running and which domains it is currently testing. – Wasif Khalil Jan 16 '15 at 05:45
  • You can do that with ob_start and ob_flush. You could also write the result status to a text file and periodically look at the text file. – Misunderstood Jan 16 '15 at 05:53
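The "write the status to a text file" suggestion could be sketched in Python as follows. This is not from the answer: the thread pool is an extra technique added here so that many domains are checked at once instead of one at a time, and the names `check_all`, `check`, `progress_path` and the default of 50 workers are assumptions chosen for illustration. `check` is any function that takes a domain and returns a truthy value when it is active.

```python
import concurrent.futures


def check_all(domains, check, progress_path, workers=50):
    """Run check(domain) on a thread pool and append one line per result.

    Each line is "<domain>\t<1 or 0>", flushed immediately so the file can
    be watched (for example with `tail -f`) while the run is in progress.
    Returns the number of domains processed.
    """
    done = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool, \
            open(progress_path, "a", encoding="utf-8") as progress:
        # Map each future back to its domain so results can be labelled.
        futures = {pool.submit(check, domain): domain for domain in domains}
        for future in concurrent.futures.as_completed(futures):
            domain = futures[future]
            try:
                active = bool(future.result())
            except Exception:
                # A check that raises is recorded as inactive, not fatal.
                active = False
            progress.write("%s\t%d\n" % (domain, 1 if active else 0))
            progress.flush()
            done += 1
    return done
```

Because every domain becomes a queued task, it is probably better to call this once per batch (for example per page of 50 domains, as the existing pagination already does) than to pass all 100+ million rows in one call. The results file can then be loaded back into MySQL in bulk.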