13

Wikipedia is geotagging a lot of its articles. (Look in the top right corner of the page.)

Is there any API for querying all geotagged pages within a specified radius of a geographical position?

Update

Okay, so based on lost-theory's answer I tried this (on DBpedia query explorer):

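PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>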
SELECT ?subject ?label ?lat ?long WHERE {
    ?subject geo:lat ?lat.
    ?subject geo:long ?long.
    ?subject rdfs:label ?label.
    FILTER(xsd:float(?lat) - 57.03185 <= 0.05 && 57.03185 - xsd:float(?lat) <= 0.05
        && xsd:float(?long) - 9.94513 <= 0.05 && 9.94513 - xsd:float(?long) <= 0.05
        && lang(?label) = "en"
    ).
} LIMIT 20

This is very close to what I want, except that it returns results within a (local) square around the point rather than a circle. I would also like the results to be sorted by distance from the point, if possible.

Update 2

I am trying to use the Euclidean distance as an approximation of the true distance, but I am having trouble squaring a number in SPARQL. (Question opened here.) When I get something useful I will update the question, but in the meantime I would appreciate any suggestions on alternative approaches.
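For reference, here is a minimal Python sketch of the Euclidean approximation described above, computed outside SPARQL. The function name, the Earth-radius constant, and the cosine scaling of the longitude difference are assumptions of this sketch, not something taken from the thread. Squaring is written as plain multiplication.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for this sketch


def approx_distance_km(lat1, lon1, lat2, lon2):
    """Flat-map (equirectangular) approximation of the distance in km.

    Longitude differences are scaled by cos(mean latitude), because a
    degree of longitude covers less ground away from the equator.
    """
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    # Pythagoras: squares written as x * x, then a square root.
    return EARTH_RADIUS_KM * math.sqrt(dx * dx + dy * dy)
```

This should be reasonable for small radii away from the poles; for large distances a great-circle formula is the safer choice.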

Update 3

A final update. I gave up on using SPARQL through DBpedia. I have written a simple parser which fetches the nightly database dump of Wikipedia article text and parses all articles for geocodes. It works rather nicely, and it allows me to store information about geotagged articles however I wish.

This is probably the solution I will continue using, and if I get around to creating a nice interface to it I might consider allowing public API access and/or publishing the source of the parser.
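The parser itself is not shown here. As a rough illustration of the idea only, the following Python sketch pulls a latitude/longitude pair out of a `{{coord|...}}` template in article wikitext. The names `COORD_RE`, `_to_decimal` and `parse_coord` are hypothetical, and the sketch assumes only the decimal form and the degrees/minutes/seconds form with hemisphere letters; real articles use more variants than this handles.

```python
import re

# Matches the first {{coord|...}} template that has no nested braces.
COORD_RE = re.compile(r"\{\{\s*coord\s*\|([^{}]*)\}\}", re.IGNORECASE)


def _to_decimal(parts, hemisphere):
    """Convert [deg, min, sec] strings (min and sec optional) to degrees."""
    degrees = sum(float(p) / 60 ** i for i, p in enumerate(parts))
    return -degrees if hemisphere in ("S", "W") else degrees


def parse_coord(wikitext):
    """Return (lat, lon) from the first {{coord}} template, or None."""
    match = COORD_RE.search(wikitext)
    if match is None:
        return None
    fields = [f.strip() for f in match.group(1).split("|")]
    # Look for hemisphere letters, as in 57|2|N|9|55|E.
    try:
        ns = next(i for i, f in enumerate(fields) if f.upper() in ("N", "S"))
        ew = next(i for i, f in enumerate(fields) if f.upper() in ("E", "W"))
    except StopIteration:
        ns = ew = None
    try:
        if ns is not None and ns < ew:
            lat = _to_decimal(fields[:ns], fields[ns].upper())
            lon = _to_decimal(fields[ns + 1:ew], fields[ew].upper())
        else:
            # Plain decimal form, as in 57.03185|9.94513.
            lat, lon = float(fields[0]), float(fields[1])
    except (ValueError, IndexError):
        return None
    return lat, lon
```

Running something like this over each page in the dump and storing the results in a local table would make radius queries a matter of ordinary database filtering.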

Bjarke Freund-Hansen

6 Answers

3

You should be able to query for latitude/longitude using SPARQL and dbpedia. An example (from here):

SELECT distinct ?s ?la ?lo ?name ?country WHERE {
?s dbpedia2:latitude ?la .
?s dbpedia2:longitude ?lo .
?s dbpedia2:officialName ?name .
?s dbpedia2:country ?country .
filter (
  regex(?country, 'England|Scotland|Wales|Ireland')
  && regex(?name, '^[Aa]')
)
}

You can run your own queries here.

Steven Kryskalla
  • Very interesting. I am unsure about this SPARQL syntax. How would I query for all articles within a specific area (defined by latitude, longitude and radius)? – Bjarke Freund-Hansen Sep 09 '09 at 17:02
  • I'm unsure if SPARQL supports trigonometric functions (it doesn't appear to), but you could filter your data set to a square to get a first "cut" of results, then compute great-circle distances client-side and apply a second filter. – Rowland Shaw Sep 10 '09 at 07:55
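The two-step approach in the comment above (bounding square on the server, circle on the client) could be sketched in Python as follows. This is only an illustration: `haversine_km` and `within_radius` are hypothetical names, the rows are assumed to be `(label, lat, lon)` tuples already returned by the square query, and the Earth radius is an assumed mean value.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(rows, lat, lon, radius_km):
    """Keep rows inside the circle and sort them nearest first.

    rows: iterable of (label, lat, lon) tuples from the square query.
    Returns a list of (distance_km, label) tuples.
    """
    hits = []
    for label, row_lat, row_lon in rows:
        distance = haversine_km(lat, lon, row_lat, row_lon)
        if distance <= radius_km:
            hits.append((distance, label))
    hits.sort()
    return hits
```

With the 0.05 degree square from the question, a call such as `within_radius(rows, 57.03185, 9.94513, 5)` would drop the corners of the square and return the rest ordered by distance.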
3

The OpenLink Virtuoso server used by the dbpedia endpoint has several query features. I found the information on http://docs.openlinksw.com/virtuoso/rdfsparqlgeospat.html useful for a similar problem.

I ended up with a query such as this:

SELECT ?page ?lat ?long (bif:st_distance(?geo, bif:st_point(15.560278, 58.394167)) AS ?distance)
WHERE {
    ?m foaf:page ?page.
    ?m geo:geometry ?geo.
    ?m geo:lat ?lat.
    ?m geo:long ?long.
    FILTER (bif:st_intersects (?geo, bif:st_point(15.560278, 58.394167), 30))
}
ORDER BY ASC(?distance) LIMIT 15

This example retrieves the geotagged locations within 30 km of the origin position, nearest first.

asynja
1

The free GeoNames.org FindNearbyWikipedia service can fetch geotagged articles for a given postal code or pair of coordinates (latitude, longitude).

It has a daily limit of 30,000 credits per application (identified by the 'username' parameter) and an hourly limit of 2000 credits. For most services, one web service request costs one credit. An exception is thrown when the limit is exceeded.
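As a hedged sketch, a request URL for the JSON variant of this service could be built in Python like this. The endpoint path and the `radius` (km) and `maxRows` parameters are written from memory of the GeoNames documentation and should be checked against it; `nearby_wikipedia_url` is a hypothetical helper name, and `"demo"` is a placeholder for your own registered username.

```python
from urllib.parse import urlencode

# Assumed endpoint for the JSON variant of FindNearbyWikipedia.
GEONAMES_ENDPOINT = "http://api.geonames.org/findNearbyWikipediaJSON"


def nearby_wikipedia_url(lat, lng, username, radius_km=None, max_rows=None):
    """Build the request URL; each request made with it costs credits."""
    params = {"lat": lat, "lng": lng, "username": username}
    if radius_km is not None:
        params["radius"] = radius_km  # assumed to be in kilometres
    if max_rows is not None:
        params["maxRows"] = max_rows
    return GEONAMES_ENDPOINT + "?" + urlencode(params)


url = nearby_wikipedia_url(57.03185, 9.94513, "demo", radius_km=10)
```

The resulting URL can be fetched with any HTTP client. Given the credit limits, caching responses on your side is probably worthwhile.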

mvark
1

There are a couple of tools listed on Tools and applications based on coordinates from Wikipedia. I'm not sure if it's what you're looking for, but the Geosearch.py tool looks pretty cool.

Bill the Lizard
1

Not an API, but you can also download this nice set of all geotagged Wikipedia articles and query it directly in a local database: http://www.google.com/fusiontables/DataSource?dsrcid=423292

Stan James
0

I'm not familiar enough with SPARQL, but if it can use powers in its filter then it's easy to compute the distance of a given article from a given point using the Pythagorean theorem (a^2 + b^2 = c^2), and that would give you all the articles within a radius.

Another option would be to get a Wikipedia data dump and process it yourself. This is what I did when I needed to do some linguistic analysis on Wikipedia articles.

Guss
  • This is what I am trying to get working right now. The results would be off close to the poles or at large radii, as latitude and longitude are not Cartesian coordinates, but they will probably be approximately okay locally. However, I simply have no idea how to compute the power of something in SPARQL, or even where to look up how to compute it. I opened a question on it here: http://stackoverflow.com/questions/1401401/power-in-sparql-and-other-math-functions When I reach a satisfying solution I will update the question, but until then I would appreciate any suggestions. :) – Bjarke Freund-Hansen Sep 09 '09 at 18:53
  • I looked at the SPARQL reference on W3 before posting this answer, and the math operations I saw there are less than satisfactory. That being said, there was some discussion on adding operators using embedded JavaScript, which may be a solution, but I didn't dive into that due to lack of time. – Guss Sep 09 '09 at 19:35
  • That sounds like what I have found. I guessed the square root operator (math:sqrt), which works, but even that does not seem to be documented on the W3 page. And this is not for displaying on a web page, so I am unsure how any JavaScript solution would help (though I noticed that discussion myself). – Bjarke Freund-Hansen Sep 09 '09 at 19:44
  • It's quite possible for a SPARQL processor to have a JavaScript parser to handle that. If you can get `math:sqrt` to work, then `math:pow` may also work. – Guss Sep 09 '09 at 20:02
  • math:pow didn't work for me, nor did trying to multiply values by themselves (some compiler error about a syntax error at '(' which I didn't understand). – Guss Sep 09 '09 at 20:13
  • My problem exactly. What I really need is a good specification of the SPARQL syntax and available 'libraries'. – Bjarke Freund-Hansen Sep 10 '09 at 06:45