I'm trying to run some statistics over the Stack Overflow data dump, and for that I would like to know the time zone for each user. However, all I have to go on is the completely free-form "location" string.

I'll stress that I'm only looking for an approximation of the time zone; of course, in general this is an unsolvable problem. However, many people fill out their country, state and/or city, which should give a pretty good indication. It's okay if it fails for other cases. It doesn't have to be reliable, it doesn't have to be accurate, it doesn't have to cover all bases.

I don't want to waste too much time on this, so I'm wondering if there is some code out there that can make a reasonable guess. Any language, platform, API or library goes. Any ideas?

Thomas

1 Answer

Check this discussion for information on how to get the lat/lon from an arbitrary location string.

Once you have the lat/lon, you can use the web services at GeoNames to retrieve the time zone.
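The two steps above can be sketched roughly as follows. This is a minimal sketch, not a tested client: it assumes the GeoNames `searchJSON` and `timezoneJSON` web services, a registered GeoNames `username`, and that the first search hit is good enough. The names `guess_time_zone` and `http_get_json` are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the GeoNames JSON web services.
GEONAMES_BASE = "http://api.geonames.org"


def http_get_json(url):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def guess_time_zone(location, username, fetch=http_get_json):
    """Return a time zone ID for a free-form location string, or None.

    `fetch` is injectable so the lookup can be exercised without network access.
    """
    # Step 1: fuzzy-match the location string to a place with coordinates.
    search_url = GEONAMES_BASE + "/searchJSON?" + urlencode(
        {"q": location, "maxRows": 1, "username": username})
    matches = fetch(search_url).get("geonames", [])
    if not matches:
        return None  # no match (or an error response): give up on this user
    lat, lng = matches[0]["lat"], matches[0]["lng"]

    # Step 2: look up the time zone at those coordinates.
    tz_url = GEONAMES_BASE + "/timezoneJSON?" + urlencode(
        {"lat": lat, "lng": lng, "username": username})
    return fetch(tz_url).get("timezoneId")
```

Calling `guess_time_zone("Amsterdam, Netherlands", "your_username")` would issue two requests per location, so the daily request limit is worth keeping in mind.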

newdayrising
  • Thanks! I hadn't found that yet. It looks like GeoNames.org (http://www.geonames.org/) can do both fuzzy string to latitude/longitude and latitude/longitude to time zone lookups. That should be all I need. At 50,000 requests per day, it would take a week to look up all users, but I can think of many ways to reduce the number of queries. – Thomas Apr 10 '10 at 20:02
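One possible way to cut down the number of queries (an assumption here, since the comment does not say which ways were meant) is to look up each distinct location string only once. A minimal sketch, with hypothetical helper names `normalize` and `unique_locations`:

```python
from collections import Counter


def normalize(location):
    """Lowercase the string and collapse runs of whitespace."""
    return " ".join(location.lower().split())


def unique_locations(locations):
    """Return (normalized_location, user_count) pairs, most common first.

    Empty or whitespace-only locations are skipped.
    """
    counts = Counter(
        normalize(loc) for loc in locations if loc and loc.strip())
    return counts.most_common()
```

Looking up the most common strings first means that a partial run already covers the largest share of users.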