
Sorry, but I have absolutely no idea what the terminology to use here is, which also makes it impossible to search for what I want to do. I only found out about SPARQL about an hour ago.

Basically, I have 475 cities that I want to know the areas of. In the course of searching various things looking for a pre-existing list (or even a very basic GIS guide) to find this, one of my results pointed out that Wikipedia has the areas for all of those cities. Unfortunately, I couldn't figure out how to get the areas for multiple cities at once.

What I can do is very, very basic. Based on the second Google Search result (I closed the other tab ages ago), I can make very basic changes to Jan Drewniak's answer. So, in principle, I can go to query.wikidata.org and find the area of each city individually by changing "Paris" in:

SELECT ?town ?area ?population ?coordinate ?country WHERE {
  ?town ?label "Paris"@en;
    wdt:P2046 ?area;
    wdt:P625 ?coordinate;
    wdt:P1082 ?population;
    wdt:P17   ?country.
}

And having done that, I can download the result and then change Paris again to one of the other 474 cities, download that result and so on until I've done this for all 475 cities. Then I can combine all 475 .csv files. That would work.

Obviously, I'd rather not do that. Tomorrow's Sunday, so I could, but it would take ages. What I'd like to be able to do is:

  1. Run a single query that includes all 475 cities. Is that possible?
  2. Get the country reported as a readable name rather than an identifier like wd:Q30. Is that possible?
  3. Tell whether the area results are all in the same unit, ideally sq km (conversions aren't an issue). Is that possible?
  4. If it is possible to do all 475 at once, could I reference the city names from a .csv file?

I should also note that query.wikidata.org/ is the only place where I know to run this.

If there's some other list made by someone else of the areas of the cities in the UN's World Urbanization Prospects data of cities over 300,000 (which is where my 475 cities were culled from), then that would also work. (On a related note, Demographia has some PDF lists of Urban Areas over 1,000,000 that do have area information, but if I try to copy and paste them, the result comes out as a single line, not a table. If I were to give up and find which of my 475 cities are in that list, how would I proceed?)

I've tried the following:

SELECT ?town ?area ?population ?coordinate ?country WHERE {
  ?town ?label "Paris"@en;
    wdt:P2046 ?area;
    wdt:P625 ?coordinate;
    wdt:P1082 ?population;
    wdt:P17   ?country.
}
SELECT ?town ?area ?population ?coordinate ?country WHERE {
  ?town ?label "London"@en;
    wdt:P2046 ?area;
    wdt:P625 ?coordinate;
    wdt:P1082 ?population;
    wdt:P17   ?country.
}

But query.wikidata.org gave me an error. I also tried variations such as "Paris" | "London"@en or "Paris"@en | "London"@en, by analogy to R, and those failed as well.

As to the tags, I've just copied those from the question where I got the above code, plus added the gis and SPARQL ones.

Marco Bonelli
Vonvorv
  • you can use SPARQL `VALUES` feature: `SELECT ?town ?area ?population ?coordinate ?country WHERE { VALUES ?label {"Paris"@en "London"@en} ?town rdfs:label ?label ; wdt:P2046 ?area; wdt:P625 ?coordinate; wdt:P1082 ?population; wdt:P17 ?country. }` - clearly, you'll get multiple results for London and also other cities as there are indeed multiple cities with the same name around the world. This is the point where you have to think about how you'll resolve this issue – UninformedUser Nov 28 '21 at 16:19
  • Thanks. Resolving multiple hits for different cities by the same name isn't/shouldn't be a problem. I'm more concerned that, for example, the same Durango repeats 4 times in the data. One Durango shows up 4 times because there are two area and two population values so that's one instance for each pair. Is there some way of getting extra details, e.g. that 17,700,000 is "square metres" and 25.788083 "square kilometre" (obv. there's an underlying error here). – Vonvorv Nov 29 '21 at 13:12
  • you can at least select the most recent value: – UninformedUser Nov 30 '21 at 07:47
  • `SELECT ?town ?area ?population ?coordinate ?country { VALUES ?label { "Durango"@en } ?town rdfs:label ?label; wdt:P625 ?coordinate; wdt:P17 ?country . ?town p:P2046 ?area_stmt . ?area_stmt ps:P2046 ?area ; pq:P585 ?area_date . FILTER NOT EXISTS { ?town p:P2046/pq:P585 ?area_date_ . FILTER (?area_date_ > ?area_date) } ?town p:P1082 ?population_stmt . ?population_stmt ps:P1082 ?population ; pq:P585 ?pop_date FILTER NOT EXISTS { ?town p:P1082/pq:P585 ?pop_date_ . FILTER (?pop_date_ > ?pop_date) } }` – UninformedUser Nov 30 '21 at 07:48
  • Thanks for the help. Ultimately, I've decided to go in a different direction. I don't know why I never found it originally but the OECD has a thing called Functional Urban Areas and as part of that it has a thing called citytools, which has a list of eFUAs, their locations, their areas and their populations, i.e. pretty much exactly what I wanted to start with. https://www.oecd.org/regional/regional-statistics/ – Vonvorv Nov 30 '21 at 12:46
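
For anyone landing here with the same four questions, here is a rough sketch of one way to combine the `VALUES` suggestion from the comments with a city list kept in a .csv file. It is written in Python and only builds the query text, which can then be pasted into query.wikidata.org. The column name `city`, the helper names and the sample data are assumptions made for illustration. The query asks the Wikidata label service for readable country and unit names, and reads each area through its full value node (`psv:P2046`) so the unit is returned next to the amount. Population and coordinates are left out to limit duplicated rows, and cities that share a name will still each produce their own rows.

```python
import csv
import io

# SPARQL template; doubled braces are literal braces after str.format().
QUERY_TEMPLATE = """\
SELECT ?town ?townLabel ?countryLabel ?area ?unitLabel WHERE {{
  VALUES ?name {{ {values} }}
  ?town rdfs:label ?name ;
        wdt:P17 ?country ;
        p:P2046 ?areaStatement .
  ?areaStatement psv:P2046 ?areaNode .
  ?areaNode wikibase:quantityAmount ?area ;
            wikibase:quantityUnit ?unit .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""


def sparql_literal(name, lang="en"):
    """Return the name as a language-tagged SPARQL string literal."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def read_city_names(csv_file, column="city"):
    """Read unique, non-empty names from one column of an open CSV file.

    The column name "city" is an assumption; change it to match your file.
    """
    names = []
    for row in csv.DictReader(csv_file):
        name = row[column].strip()
        if name and name not in names:
            names.append(name)
    return names


def build_query(names):
    """Build one query whose VALUES clause lists every city name."""
    values = " ".join(sparql_literal(name) for name in names)
    return QUERY_TEMPLATE.format(values=values)


if __name__ == "__main__":
    # Stand-in for open("cities.csv", newline="", encoding="utf-8").
    sample = io.StringIO("city\nParis\nLondon\nDurango\n")
    print(build_query(read_city_names(sample)))
```

In the results, `?unitLabel` should show whether a given area is recorded in square kilometres, square metres or something else, which makes mixed units easy to spot and convert in a spreadsheet afterwards. Replacing `psv:P2046` with `psn:P2046` should return the normalized SI amounts instead, where Wikidata has them.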
