2

I'm trying to find countries/cities on a web page, so I used Geograpy, but it is not working properly. Note: the given website lists all the states in the United States. Website: http://state.1keydata.com/

import geograpy
url='http://state.1keydata.com/'
place=geograpy.get_place_context(url=url)
print(place.countries)  # prints []
print(place.cities)  # prints []

I have installed all the required packages, such as geograpy and nltk (all), and I am using Anaconda.

Please guide me if I'm doing something wrong.

Thank you in advance :)

Wolfgang Fahl

3 Answers

2

The page you want to test is on a site with an improper certificate, which leads to a different problem that I didn't try to solve. Instead, I am using https://en.wikipedia.org/wiki/U.S._state as the example.

As a committer of geograpy3, I added a test to the most recent geograpy3 to reproduce your issue: https://github.com/somnathrakshit/geograpy3/blob/master/tests/test_extractor.py

def testStackoverflow43322567(self):
    '''
    see https://stackoverflow.com/questions/43322567/python-geograpy-is-not-finding-cities-in-usa
    '''
    url='https://en.wikipedia.org/wiki/U.S._state'
    e=Extractor(url=url)
    places=e.find_geoEntities()
    self.check(places,['Alabama','Virginia','New York'])
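The `check` helper used in the test is not shown here. As a rough sketch, assuming it only verifies that every expected name appears among the extracted places, the same idea can be written as a standalone function. The name `missing_places` and the sample lists are made up for illustration and are not part of geograpy3:

```python
# Sketch only: missing_places is a hypothetical helper, not geograpy3 code.
# It mirrors what a check like self.check(places, expected) presumably does.

def missing_places(places, expected):
    """Return the expected names that do not occur in the extracted places."""
    return [name for name in expected if name not in places]


# Illustrative data standing in for the result of e.find_geoEntities()
sample_places = ['Alabama', 'Virginia', 'New York', 'Texas']

# An empty list means every expected state was found
print(missing_places(sample_places, ['Alabama', 'Virginia', 'New York']))
```

If the returned list is not empty, it names the states the extractor failed to find, which is handy when comparing results across different pages.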
Wolfgang Fahl
0

The page you tested on doesn't contain any city or country names, so it's not surprising that you get empty results.

alexis
  • Please help me to find the states in that page – iamchellapandi Apr 10 '17 at 13:12
  • Try `place.regions` or `place.other`. Just reading off the project page... – alexis Apr 10 '17 at 13:17
  • The project is meant to extract city, region and country names out of *ordinary text.* Why are you testing it on a table of US states, which are none of these? Test it on something like the news text given in the project front page, and look for cities. – alexis Apr 10 '17 at 13:46
  • My project is meant to extract the country from a given web page (whatever the page). I have to traverse its contact-us page and find the address (country). If the country is not on the website, I have to find the state or city and use it to work out the country. I already have code that can traverse the contact-us page. Please guide me if possible. Thank you in advance @alexis – iamchellapandi Apr 10 '17 at 15:11
0

I found that manually re-installing all the required packages, as well as adding a tweak to the geograpy library files, did the trick. Check this for more details.

  1. lxml
  2. beautifulsoup
  3. pillow

Next, I ran nltk.download() in a Python session started from the command line.

After doing these steps, I got another error message:

Traceback (most recent call last):
  File "ExtractLocation_geograpy.py", line 5, in <module>
    places = geograpy.get_place_context(text = text1)
  File "C:\Users\Avardhan\Documents\CVS_POC\.env\lib\site-packages\geograpy\__init__.py", line 11, in get_place_context
    pc.set_cities()
  File "C:\Users\Avardhan\Documents\CVS_POC\.env\lib\site-packages\geograpy\places.py", line 174, in set_cities
    self.country_cities[country.name] = []

By replacing country.name with country_name, I was able to finally get the required output.
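The traceback above is cut off before the exception line, so the exact error is not known. One plausible reading, and it is only an assumption, is that `country` was not a usable object at that point while `country_name` already held a fallback value. The sketch below reproduces that situation with made-up stand-ins (`Country`, `lookup_country`, `group_cities` and the row layout are hypothetical, not geograpy's real classes or schema):

```python
# Sketch only: all names here are hypothetical stand-ins, not geograpy code.
from collections import namedtuple

Country = namedtuple("Country", ["name"])

# Tiny lookup table; unknown codes yield None, as a failed lookup might
KNOWN_COUNTRIES = {"US": Country("United States")}


def lookup_country(code):
    """Return a Country for a known code, or None otherwise."""
    return KNOWN_COUNTRIES.get(code)


def group_cities(rows):
    """Group city names by country name, using a fallback name when needed."""
    country_cities = {}
    for code, fallback_name, city in rows:
        country = lookup_country(code)
        # Prefer the looked-up name, else fall back to the name stored in the row
        country_name = country.name if country is not None else fallback_name
        if country_name not in country_cities:
            # Writing country.name here would raise AttributeError when
            # country is None; country_name is always a plain string.
            country_cities[country_name] = []
        country_cities[country_name].append(city)
    return country_cities


rows = [("US", "United States", "Boston"), ("XX", "Atlantis", "Poseidonis")]
result = group_cities(rows)
print(result)
```

Keying the dictionary on `country_name` keeps the membership test and the assignment consistent, which is what the one-word change in `places.py` achieves.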

CDspace
avcodes
  • 51
  • 1
  • 1