
I have the following code:

<?php
include('simple_html_dom.php');
$html = file_get_html('http://www.google.com/search?q=BA236', false);
$e = $html->find("div[class=g]");
echo $e[0]->innertext;
?>

When I run it, I get the first div with class g from the Google search results, which is:

British Airways Flight 236

Scheduled   departs in 13 hours 13 mins

Departure   DME 5:40 AM     —

Moscow  Dec 15

Arrival LHR 6:55 AM     Terminal 5

London  Dec 15

Scheduled   departs in 1 day 13 hours

Departure   DME 5:40 AM     —

Moscow  Dec 16

Arrival LHR 6:55 AM     Terminal 5

London  Dec 16

My problem is that I don't need all of that information, and I have no idea how to filter this output because the HTML has no ids or classes. I thought about hiding the HTML I don't need with jQuery or plain CSS, but it's the same problem: there are no ids or classes to select.

So how can I filter out the information I don't want? Please just show me an example; I'll work out which HTML to remove myself. Thanks.

1 Answer


What you are looking for is a grep-style search, which in PHP means regular expressions. See the Stack Overflow question "PHP to search within txt file and echo the whole line" for a possible answer. Slightly modified for your case:

// double quotes are required so that \n is interpreted as a newline
$contents = "British Airways Flight 236\n\nScheduled   departs in 13 hours 13 mins\n\nDeparture   DME 5:40 AM     —\n\nMoscow  Dec 15\n\n...";

$searchfor = 'departs';

// escape characters that have a special meaning in a regular expression
$pattern = preg_quote($searchfor, '/');
// finalise the regular expression, matching the whole line
$pattern = "/^.*$pattern.*\$/m";
// search, and store all matching occurrences in $matches
if (preg_match_all($pattern, $contents, $matches)) {
    echo "Found matches:\n";
    echo implode("\n", $matches[0]);
} else {
    echo "No matches found";
}
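
To make this independent of the flight number, the same idea can be extended to several keywords that belong to the fixed template rather than to a particular flight. The following is only a sketch: `filterLines` is a hypothetical helper name, and the sample text is copied from the output in the question.

```php
<?php
// Hypothetical helper: keep only the lines that contain one of the given keywords.
function filterLines(string $contents, array $keywords): array
{
    // Escape each keyword so it is matched literally inside the regex.
    $escaped = array_map(function ($k) {
        return preg_quote($k, '/');
    }, $keywords);

    // Match every whole line containing any keyword (m = multiline, u = UTF-8).
    $pattern = '/^.*(?:' . implode('|', $escaped) . ').*$/mu';
    preg_match_all($pattern, $contents, $matches);

    return $matches[0];
}

// Sample text taken from the question's output.
$contents = "British Airways Flight 236\n\nScheduled   departs in 13 hours 13 mins\n\n"
          . "Departure   DME 5:40 AM     —\n\nMoscow  Dec 15\n\n"
          . "Arrival LHR 6:55 AM     Terminal 5\n\nLondon  Dec 15\n";

// Only the template labels are hard-coded, not the flight data.
$lines = filterLines($contents, ['Departure', 'Arrival']);
echo implode("\n", $lines), "\n";
```

With this sample text, only the Departure and Arrival lines are kept, whatever airports or times they contain.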

Edit:

Or, as mentioned in a comment, keep the HTML structure instead of flattening it to text, which makes parsing easier (for example ->saveHTML() on a DOMDocument, or ->outertext in simple_html_dom).
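
If the HTML structure is available, elements without an id or class can still be selected by the text they contain or by their position. Below is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. The markup in `$html` is a hypothetical stand-in, since the real structure of the Google result block is not shown here and will differ; the XPath expression would have to be adapted after inspecting the actual HTML.

```php
<?php
// Hypothetical markup standing in for the result block; Google's real HTML differs.
$html = '<div class="g"><table>'
      . '<tr><td>Departure</td><td>DME 5:40 AM</td></tr>'
      . '<tr><td>Moscow</td><td>Dec 15</td></tr>'
      . '<tr><td>Arrival</td><td>LHR 6:55 AM</td></tr>'
      . '</table></div>';

// Collect parser warnings internally; real-world HTML rarely validates cleanly.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select rows by the label in their first cell instead of by id or class.
$rows = $xpath->query(
    '//tr[td[1][normalize-space() = "Departure" or normalize-space() = "Arrival"]]'
);

// Map each label (first cell) to its value (second cell).
$wanted = [];
foreach ($rows as $row) {
    $cells = $row->getElementsByTagName('td');
    $wanted[trim($cells->item(0)->textContent)] = trim($cells->item(1)->textContent);
}

foreach ($wanted as $label => $value) {
    echo $label, ': ', $value, "\n";
}
```

Because the rows are chosen by their label text, the flight number and times can change without touching the selection logic, as long as the template labels stay the same.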

mab
  • Thanks for your answer, but I need something universal because I want to be able to change the flight number. If the flight number changes, the text changes... –  Dec 14 '17 at 16:19
  • Maybe something like: choose this and this piece of information and leave the rest alone. I don't know; maybe I should output only the elements I need in separate echo statements. But it's the same problem: how do I choose an element if its name is the same as 100 others? –  Dec 14 '17 at 16:22
  • `$searchfor` is a variable and regular expressions are about as flexible as you can get with parsing this on your own. I just set `$contents` and `$searchfor` to fixed values for demonstration. You might as well search for a destination name from a list of airports for instance. – mab Dec 14 '17 at 16:31
  • But maybe `->innertext` is the actual problem here, since it throws away the html structure. You might want to use `->saveHTML` instead. – mab Dec 14 '17 at 16:35
  • So I can use `$searchfor` even if 90% of the inner HTML changes every time I change the Google request / flight number? –  Dec 14 '17 at 16:38
  • If you search for an element of the fixed template used by Google (Flight, Scheduled, Departure, Arrival, Terminal, or departs), then yes. – mab Dec 14 '17 at 16:54