0

I just want one suggestion or a way I can handle a situation

I am having a module in which I used to crawl a website and get the current movies running in nearby cinema halls. I have two tables for that 1) One is for movies and other is for cinemas reffering the movies which are inserted in movie table first.

For now I have set my file on every morning cron Job. So in my code I first delete all data in both the tables and insert the new data . But by this I usually loose all rating scores given by the end user fr that particula movie.

To overcome this situation , I thought of some solution

I created a new query

INSERT INTO jos_movie (movie_name, language, cast,movie_release,director,rating,rating_count,movie_ids)
SELECT * FROM (SELECT 'test','null','yahoo','Dec 21, 2012','himmat',250,230,'43677') AS tmp
WHERE NOT EXISTS (
    SELECT movie_name FROM jos_movie WHERE movie_name = 'test')

Similary I created the same approach for cinema table also.

By this it will check and will not overwrite the movie in the table . But there is some problme in this approach. If the cinema owner did removed the shows for that particular movie eg 'test' . Then by above query it wont delete that . It will remain there .

Sorry for my subject line as I am not able to think for good subject line for this issue.

So how can I achieve a result so that the existing movie wont get update if its in the table and gets deleted if its not in the crawl result array of my script.

Here is my table results

this is movie table result

enter image description here

This is cinema table

enter image description here

here is the code I am using for same.

$con=mysql_connect('localhost','test','test');
mysql_select_db('test',$con);

// Use cURL to get the RSS feed into a PHP string variable.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'myrsslink.xml');
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xml = curl_exec($ch);
curl_close($ch);

$arrData = array();
// Create an array of item elements from the XML feed.
$news_items = element_set('item', $xml);
$del_movie = "delete from jos_movie";
mysql_query($del_movie);

$del_cinema = "delete from jos_cinema";
mysql_query($del_cinema);

foreach($news_items as $item) {
    $title = value_in('title', $item);
    $url = value_in('link', $item);
    $cast = value_in('description', $item);
    //curl_setopt($ch, CURLOPT_URL,$url);
    //curl_setopt($ch, CURLOPT_HEADER, false);
    //curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    //$html = curl_exec($ch);
    $arrTitle = explode('-',$title);
    $html = file_get_html($url);
    $htmlShowTime = '';

    // find all span tags with class=gb1 moviTimes moviTmngBox
    foreach($html->find('ul[style=line-height:2em;]') as $e)
        $htmlShowTime = $e->plaintext;

    $movie_name = $arrTitle[0];

    $apiKey = '30f44b6ef9472d414e50d2acaa058b60';
    $url = sprintf('http://api.themoviedb.org/2.1/Movie.search/en/xml/%s/"%s"',$apiKey,rawurlencode(trim($movie_name)));

    //$xml = simplexml_load_file("http://api.themoviedb.org/2.1/Movie.search/en/xml/accd3ddbbae37c0315fb5c8e19b815a5/"$movie_name"");
    $xml = simplexml_load_file($url);
    $movies = $xml->movies->movie;
   foreach ($movies as $movie){
        $arrMovie_id = $movie->id;
    }
    $arrStr = explode(':',$htmlShowTime);
    $release = substr($arrStr[3],0,strlen($arrStr[3])-8);
    $director = substr($arrStr[5],0,strlen($arrStr[5])-11); 

    $sql_movie = "insert into jos_movie(movie_name,language,cast,movie_release,director,rating,rating_count,movie_ids)values('$movie_name','null','$cast','$release','$director',250,230,'$arrMovie_id')";
    //echo $sql.'<br>';
   // echo $sql_movie;

    mysql_query($sql_movie);

    $sqlCount = 'select max(id) from jos_movie' or die("cannot select DB");
    $data = mysql_query($sqlCount);
    echo $data;
    print_r($data);
    $result = mysql_fetch_array($data);
    $id = $result[0];
    echo '<br>'.$id.'<br>'; 

    //$id = mysql_insert_id();
    //echo $id;

        // find all span tags with class=gb1
    foreach($html->find('div.moviTmngBox') as $e){
        $tagTitle =  $e->find('a',0);
        $tagTime  = $e->find('div.moviTimes',0);
        $name = $tagTitle->title;
        $time = $tagTime->innertext;

    $trimName = '';
    $temName = strtolower(str_replace(' ','',$name));

    if(strpos($temName,'indraaudi1') !== false)
      $trimName = 'Indra Audi 1' and  $cinemaId = '1' and $long='32.726602' and $lat='74.857026';
    elseif(strpos($temName,'indraaudi2') !== false)
     $trimName = 'Indra Audi 2' and $cinemaId = '2'and $long='32.726602' and $lat='74.857026';
    elseif(strpos($temName,'indraaudi3') !== false)
      $trimName = 'Indra Audi 3'and $cinemaId = '3' and $long='32.726602' and $lat='74.857026';
    elseif(strpos($temName,'apsra') !== false)
      $trimName = 'Apsra' and $cinemaId = '4' and $long='32.700314' and $lat='74.858023';
    else{
        $trimName = trim(substr($name,18,strlen($name))) and $cinemaId = '5' and $long='32.7300' and $lat='74.8700' ;
    }

        //echo $tagTime->innertext.'<br/>';
        $sql = "insert into jos_cinema(cinema_name,show_time,movie_id,cinemaId,logitude,latitude)values('$trimName','$time',$id,$cinemaId,$long,$lat)";
        //echo $sql.'<br/>';
        mysql_query($sql);
        //$arrTem = array($tagTitle->title,$tagTime->innertext);

    }

}//end rss feed loop

?>

Please not that I am inserting a default value of rating with a movie.

Thanks

Yogus
  • 2,307
  • 5
  • 20
  • 38

3 Answers3

1

First of all create a good DB structure:-

  1. Create a ID to uniquely define each movie in your tables.
  2. Normalize your table structure. Means keep movie details and active movie lists in separate tables(active_movies). Follow the below link for more info Normalization in MYSQL

  3. Instead of updating the complete row again and again for the same movie, just update the active movie table.

  4. keep a backup of the data means for all the movies, which can help you in future, to show the list and details of all the movies released in a year or including a particular cast, or rating above 4 etc.

A good DB structure will make your code a lot simpler and easily implementable. Few more helpful links http://searchbusinessintelligence.techtarget.in/tutorial/Database-normalization-in-MySQL-Four-quick-and-easy-steps

Community
  • 1
  • 1
Nishant
  • 3,614
  • 1
  • 20
  • 26
1

Probably best served with a staging table.

1) Load all your data

2) Delete everything where NOT EXISTS in your staging table

3) Update where exists/join to your staging table

4) Insert where NOT EXISTS in your proper table

The last 2 can likely be done with the MERGE statement or better still, the ETL tool that comes with your RDBMS

To be honest though, your data model is not particularly clear.

LoztInSpace
  • 5,584
  • 1
  • 15
  • 27
1

Ideally, the rating information should have been separated so that the movie rating information can be decoupled from the movie information. However, if this is not what you can implement in your situation, I would take two steps as below.

Identify removed movies and delete them from the movie table (this assumes that your new movie information is in the movie_new table)

DELETE y.*
  FROM
      (
       SELECT a.movie_name
         FROM jos_movie a
         LEFT OUTER JOIN movie_new b
           ON a.movie_name = b.movie_name
        WHERE b.movie_name IS NULL
      ) x, tbl_issue y
WHERE x.movie_name = y.movie_name

Add new movies to the movie table.

INSERT INTO jos_movie
SELECT b.*
  FROM jos_movie a
 RIGHT OUTER JOIN movie_new b
    ON a.movie_name = b.movie_name
 WHERE a.movie_name IS NULL

You can take the same steps to the other table I guess.