
I am trying to extract locations from text using the geograpy3 library, but it throws an error.


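import geograpy

# feedContent is the list of feed texts built earlier in rss_scraper.py (not shown)
placesInFeed = []

for content in feedContent:
    if content != "":
        # Extract the places mentioned in this feed item
        place = geograpy.get_place_context(text=content)
        placesInFeed.append(place.places)
    else:
        # Keep the result list aligned with the feed when an item is empty
        placesInFeed.append("null")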

The result is:

Traceback (most recent call last):
  File "C:/Users/Peshala/Documents/SDGP/Location-based-news-recommendation-master/Backend/rss_scraper.py", line 46, in <module>
    place = geograpy.get_place_context(text=content)
  File "C:\Users\Peshala\PycharmProjects\Location-based-news-recommendation\venv\lib\site-packages\geograpy\__init__.py", line 11, in get_place_context
    pc.set_cities()
  File "C:\Users\Peshala\PycharmProjects\Location-based-news-recommendation\venv\lib\site-packages\geograpy\places.py", line 137, in set_cities
    self.populate_db()
  File "C:\Users\Peshala\PycharmProjects\Location-based-news-recommendation\venv\lib\site-packages\geograpy\places.py", line 30, in populate_db
    for row in reader:
  File "C:\Users\Peshala\AppData\Local\Programs\Python\Python36\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 277: character maps to <undefined>
Wolfgang Fahl

1 Answer


As a committer of geograpy3, I tried to reproduce your issue by adding a test to the most recent geograpy3 (https://github.com/somnathrakshit/geograpy3/blob/master/tests/test_extractor.py):

    def testStackoverflow55548116(self):
        '''
        see https://stackoverflow.com/questions/55548116/geograpy3-library-is-not-working-properly-and-give-traceback-error
        '''
        feedContent=['Las Vegas is a city in Nevada']
        placesInFeed=[]
        
        for content in feedContent:
            if content != "":
                e=Extractor(text=content)
                e.find_entities()
                places = e.places
                if self.debug:
                    print(places)
                placesInFeed.append(places)  

The result might not be what you expect:

['Las', 'Vegas', 'Nevada']

but the test does not show any error, so please supply the feedContent that does. You might want to fork the project, modify the test, and open a pull request for your problem.
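For anyone trying to reproduce the error outside of geograpy3: the traceback in the question ends in `cp1252.py` while iterating over a CSV reader, which suggests that some file was being decoded with the Windows default encoding rather than UTF-8. The sketch below is not geograpy3's code. The file name `cities.csv`, the sample row, and the helper `read_rows` are made up for illustration. It only shows that a UTF-8 file containing the byte 0x8d fails under cp1252 and reads fine when the encoding is given explicitly.

```python
import csv
import os
import tempfile


def read_rows(path, encoding):
    """Read every row of a CSV file using the given text encoding."""
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical data file; "č" (U+010D) is encoded in UTF-8 as 0xC4 0x8D
    path = os.path.join(tmp, "cities.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(["Čačak", "RS"])

    # cp1252 has no character assigned to byte 0x8d, so decoding fails
    try:
        read_rows(path, "cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

    # Naming the encoding explicitly reads the same file without error
    rows = read_rows(path, "utf-8")
```

If a small case like this matches what happens on your machine, it would make a useful addition to the test above.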

Wolfgang Fahl
  • Maybe you could put 0x8d into your test feed to reproduce what happened? It seems folks have dealt with such content here: https://stackoverflow.com/questions/30598350/unicodedecodeerror-charmap-codec-cant-decode-byte-0x8d-in-position-7240-cha – antont Sep 09 '20 at 10:06
  • In that case it would not be an error in the library but a matter of not feeding UTF-8 into it. Non-UTF-8 input is currently not supported but would certainly be feasible. If that is needed, just file an issue with a full example at https://github.com/somnathrakshit/geograpy3/issues. – Wolfgang Fahl Sep 09 '20 at 10:11
  • I don't know if support for non-UTF-8 input is needed there; maybe the correct answer is just to convert the input to UTF-8. – antont Sep 09 '20 at 10:57
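
A minimal sketch of the "convert the input to UTF-8" idea from the last comment, assuming the feed content may arrive as raw bytes. The helper name `to_text` and the cp1252 fallback are assumptions for illustration and are not part of geograpy3.

```python
def to_text(raw, fallback="cp1252"):
    """Return a str, decoding bytes as UTF-8 first and a fallback encoding second."""
    if isinstance(raw, str):
        # Already decoded text, nothing to do
        return raw
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes the fallback encoding cannot map become U+FFFD instead of raising
        return raw.decode(fallback, errors="replace")


plain = to_text("Las Vegas is a city in Nevada")
from_utf8 = to_text("Čačak".encode("utf-8"))
from_cp1252 = to_text(b"caf\xe9")
undefined_byte = to_text(b"\x8d")
```

Each item of feedContent could be passed through such a helper before it is handed to the extractor, so that the library only ever sees decoded text.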