I have an application that stores Twitter users' locations in the database (the location field returned by the GET search/tweets API). The problem is that this field contains whatever the user typed into their profile. A sample of what is already in my database:

All over the world & London
Boca Raton, Florida
Sunway Damansara, Malaysia
Munich
Brazil
40.769549,-73.993696
Long Island City, NY USA
United States
USA-Japan

As you can see, there is great variation (plus junk such as "Location: in my world"), which makes this data hard to handle. What I want is a script that parses each value and updates the row to a country (e.g. Munich would become Germany, and NY could become United States; it is only a prototype application, not a product). I am doing everything in PHP and MySQL.

Are there any suggestions on how to tackle this issue?

UPDATE

I found this example, which does roughly what I want, but it does not work for every value. For instance, when I enter London or Athens it shows United States, but when I enter Munich it shows Germany. Any idea why?

<?php
// Get STATE from Google GeoData
function reverse_geocode($address) {
    $address = str_replace(" ", "+", "$address");
    $url = "http://maps.google.com/maps/api/geocode/json?address=$address&sensor=false";
    $result = file_get_contents("$url");
    $json = json_decode($result);
    foreach ($json->results as $result)
    {
        foreach($result->address_components as $addressPart) {
            if((in_array('locality', $addressPart->types)) && (in_array('political', $addressPart->types)))
                $city = $addressPart->long_name;
            else if((in_array('administrative_area_level_1', $addressPart->types)) && (in_array('political', $addressPart->types)))
                $state = $addressPart->long_name;
            else if((in_array('country', $addressPart->types)) && (in_array('political', $addressPart->types)))
                $country = $addressPart->long_name;
        }
    }

    if(($city != '') && ($state != '') && ($country != ''))
        $address = $city.', '.$state.', '.$country;
    else if(($city != '') && ($state != ''))
        $address = $city.', '.$state;
    else if(($state != '') && ($country != ''))
        $address = $state.', '.$country;
    else if($country != '')
        $address = $country;

    // return $address;
    return "$country/$state/$city";
}

// Usage: In my case, I needed to return the State and Country of an address
$myLocation = reverse_geocode("Athens");
echo "$myLocation";
?>

There actually seems to be a priority on searching the US first, so it returns the places called Athens and London in the US instead of the ones in Greece and the United Kingdom. How can that be changed? Is it even possible?

Andrew
  • I wonder if you could feed it to something like Google Maps API and it would tell you (if valid, the first and last are not) what the country of origin is. (Although technically the first returns London in a Google Maps search.) – Jared Farrish Mar 22 '14 at 13:26
  • Might help: http://stackoverflow.com/questions/4013606/google-maps-how-to-get-country-state-province-region-city-given-a-lat-long-va – Jared Farrish Mar 22 '14 at 13:28
  • Well, started experimenting a bit with google maps API and also found this one: [example](http://www.internoetics.com/2012/02/09/country-state-city-from-googles-geocoding-api/) It actually works for some places but not for everything. For example if I enter Athens or London it shows United States, however when I enter Munich it shows Germany.. I wonder why.. I will update my question by adding that – Andrew Mar 22 '14 at 14:11
  • You should include the code you've got in your question. – Jared Farrish Mar 22 '14 at 14:16
  • @JaredFarrish just included the code which I found and trying to adjust in my case. – Andrew Mar 22 '14 at 14:23
  • Y'know, hmm. How would you distinguish from someone in Athens, Greece from Athens, Georgia from a single field? From a Twitter point of view, the user could be in either (both?). Prioritizing to international won't solve the problem since it will show all in one city, although Twitter has many users in the other. Seems like garbage data. You would need some other piece of data to validate "origin" in coordination with this field, I would think. – Jared Farrish Mar 22 '14 at 14:30
  • @JaredFarrish you are right on that. Actually its not good data, but this is the only I have now and I will work this script as such. It's for practising purpose anyway. :) But, do you have an idea if it's possible to change the prioritization of the returned results? – Andrew Mar 22 '14 at 14:42
  • I don't think Google Maps API provides a way to do that; what you want is to essentially return all results internationally, but it appears that it only returns a [region bias](https://developers.google.com/maps/documentation/geocoding/#RegionCodes) for where the request originates (or it defaults to US, if region left blank). You're back to where you started: You need to know the region. Something you might fiddle with is providing multiple region codes (`region[0]=gb&region[1]=us&...` or `region=gb,us,sp...`). It's worth a shot. Or try [Bing Maps](http://www.microsoft.com/maps/)? – Jared Farrish Mar 22 '14 at 14:57
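
The region bias discussed in the last comment can be sketched as a small URL builder. This is only an illustration under assumptions: `build_geocode_url` is a hypothetical helper, the `maps.googleapis.com` endpoint and the `YOUR_API_KEY` placeholder are not taken from the question, and `region` is passed as a single ccTLD-style code (whether several codes can be combined, as suggested above, is not verified here).

```php
<?php
// Hypothetical helper (not part of the original code): builds a Geocoding API
// request URL and adds a region bias only when one is supplied.
function build_geocode_url(string $address, ?string $region = null, string $apiKey = 'YOUR_API_KEY'): string
{
    $params = ['address' => $address, 'key' => $apiKey];
    if ($region !== null) {
        // A single ccTLD-style code such as "gr" or "gb" (assumption: one code per request).
        $params['region'] = $region;
    }
    // http_build_query() URL-encodes the values, so no manual str_replace() is needed.
    return 'https://maps.googleapis.com/maps/api/geocode/json?' . http_build_query($params, '', '&');
}

// Bias the ambiguous name "Athens" towards Greece.
echo build_geocode_url('Athens', 'gr'), "\n";
```

A bias only shifts the ranking; it still requires knowing which region to prefer, which is the limitation raised in the comments.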

1 Answer

Since I found a solution to my problem, I am posting the corrected version of the code above. The city, state and country are all available in the first result, but the original code kept iterating over the remaining results, which overwrote those values with US matches (as I saw in the returned JSON). Next I will also parse the data to exclude junk values such as "in my world", so that I can group the rows by country.

<?php
// Get the COUNTRY for a free-text location from Google GeoData
function reverse_geocode($address) {
    // urlencode() handles spaces and any other special characters
    $url = "http://maps.google.com/maps/api/geocode/json?address=" . urlencode($address) . "&sensor=false";
    $result = file_get_contents($url);
    $json = json_decode($result);

    // Initialise so that undefined variables are never read
    $city = $state = $country = '';

    // No usable response (request failed or nothing matched)
    if (empty($json->results)) {
        return $country;
    }

    // Only the first result is used; later results would overwrite these values.
    foreach ($json->results[0]->address_components as $addressPart) {
        if ((in_array('locality', $addressPart->types)) && (in_array('political', $addressPart->types))) {
            $city = $addressPart->long_name;
        } else if ((in_array('administrative_area_level_1', $addressPart->types)) && (in_array('political', $addressPart->types))) {
            $state = $addressPart->long_name;
        } else if ((in_array('country', $addressPart->types)) && (in_array('political', $addressPart->types))) {
            $country = $addressPart->long_name;
        }
    }

    return $country;
}

// Usage: in my case, I only needed the country of an address
$myLocation = reverse_geocode("Athens");
echo $myLocation;
?>
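
As a rough sketch of how junk values could be excluded, the response's `status` field can be checked before reading the first result. `extract_country` is a hypothetical helper, and the two JSON strings below are hand-written to imitate the response shape; they are not real API output.

```php
<?php
// Hypothetical helper: returns the country name from the first geocoding result,
// or null when the response is not valid JSON or reports no usable match.
function extract_country(string $jsonBody): ?string
{
    $data = json_decode($jsonBody, true);
    if (!is_array($data) || ($data['status'] ?? '') !== 'OK' || empty($data['results'])) {
        return null;
    }
    // Look only at the first result, as in the function above.
    foreach ($data['results'][0]['address_components'] ?? [] as $part) {
        if (in_array('country', $part['types'] ?? [], true)) {
            return $part['long_name'] ?? null;
        }
    }
    return null;
}

// Hand-written payloads that only imitate the response shape (assumption, not real output).
$found = '{"status":"OK","results":[{"address_components":['
       . '{"long_name":"Munich","types":["locality","political"]},'
       . '{"long_name":"Germany","types":["country","political"]}]}]}';
$junk  = '{"status":"ZERO_RESULTS","results":[]}';

var_dump(extract_country($found)); // string(7) "Germany"
var_dump(extract_country($junk));  // NULL
```

Rows for which the helper returns null could then be left untouched or flagged, rather than being assigned a wrong country.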

@JaredFarrish thanks for the help you provided in the first part :)

Andrew