
This is the error that I'm getting

ERROR:scrapy.core.scraper:Error processing {'action': u'Rent',
 'ad_images': [bla bla bla],
 'ad_link': u'does_not_exit_in_this_website',
 'ad_title': u'boa bra bra',
 'agent_fees': 2300.0,
 'amenities': u'boa bra bra',
 'area': u'does_not_exit_in_this_website',
 'bathrooms': 1.0,
 'bedrooms': u'1',
 'building': u'',
 'category': u'Apartment',
 'city': -1,
 'commission': u'does_not_exit_in_this_website',
 'coordinates': u'does_not_exit_in_this_website',
 'country': u'',
 'ded_licence_number': u'718652',
 'description': u'Description:',
 'furnished': u'No',
 'latitude': -1,
 'link': u'bla bla bla',
 'location': u'',
 'longitude': -1,
 'mobile': u'does_not_exit_in_this_website',
 'payment_type': u'does_not_exit_in_this_website',
 'phone': u'',
 'phoneticarea': u'does_not_exit_in_this_website',
 'phoneticbuilding': u'does_not_exit_in_this_website',
 'phoneticsubarea': u'does_not_exit_in_this_website',
 'posting_date': u'2016-01-04',
 'price': u'does_not_exit_in_this_website',
 'price_sqft': u'does_not_exit_in_this_website',
 'property_reference': u'Ramzi',
 'rent_is_paid': u'Quarterly',
 'rera_registration_number': u'15691',
 'security_deposit': u'does_not_exit_in_this_website',
 'size': 1000.0,
 'source': u'dubizzleproperty',
 'subarea': u'does_not_exit_in_this_website',
 'trade_name': u'BLUE HOME PROPERTIES',
 'type': u'does_not_exit_in_this_website',
 'yearly_cost': 43000.0}
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/twisted/internet/defer.py", line 588, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "bra bra bla/pipelines.py", line 70, in process_item
    body = '{"building": "{0}", "area" : "{1}", "subarea" : "{2}", "country" : "{3}", "city" : "{4}", "payment_type" : "{5}", "category" : "{6}", "phoneticbuilding" : "{7}", "phoneticarea" : "{8}", "phoneticssubarea": "{9}" }'.format(building, area, subarea, country, city, payment_type, category, phoneticbuilding, phoneticarea, phoneticssubarea)
KeyError: '"building"'

For reference, this is line 70 of pipelines.py:

body = '{"building": "{0}", "area" : "{1}", "subarea" : "{2}", "country" : "{3}", "city" : "{4}", "payment_type" : "{5}", "category" : "{6}", "phoneticbuilding" : "{7}", "phoneticarea" : "{8}", "phoneticssubarea": "{9}" }'.format(building, area, subarea, country, city, payment_type, category, phoneticbuilding, phoneticarea, phoneticssubarea)
Marco Dinatsoli

2 Answers


As @arthur says, you need to double the literal curly braces. I would add that you should double only the braces that belong to the JSON text itself (the outer { and }), and leave the replacement fields such as {0} and {1} with single braces:

For example (short version of string), replace:

body = '{"building": "{0}", "area" : "{1}" }'.format(building, area)

with:

body = '{{"building": "{0}", "area" : "{1}" }}'.format(building, area)
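To make the difference concrete, here is a minimal sketch that reproduces the KeyError and then applies the doubled braces. The values assigned to `building` and `area` are made up for illustration and do not come from the question.

```python
import json

# Hypothetical sample values, only for illustration.
building = "Marina Tower"
area = "Dubai Marina"

# Unescaped: .format treats the outer { ... } as a replacement field
# whose name is '"building"', so the lookup fails with a KeyError.
try:
    '{"building": "{0}", "area" : "{1}" }'.format(building, area)
    error_key = None
except KeyError as exc:
    error_key = exc.args[0]

# Escaped: {{ and }} become literal { and }, while {0} and {1}
# are still filled in from the positional arguments.
body = '{{"building": "{0}", "area" : "{1}" }}'.format(building, area)

print(error_key)
print(body)
```

The resulting `body` parses with `json.loads`, which is a quick way to check that the braces ended up where JSON expects them.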
Carlos Peña

Escape the literal { and } braces in the body, i.e. the ones that are not replacement fields.

Edit: That means replacing them with {{ and }}. See "How can I print literal curly-brace characters in python string and also use .format on it?"
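A different approach, not suggested in this thread and offered only as a sketch, is to skip the string template and let the standard-library `json` module build the body from a dict. This sidesteps brace escaping entirely and also escapes quotes inside the values, which `.format` does not do. The helper name `build_body` and the sample values are hypothetical.

```python
import json

def build_body(building, area):
    """Serialize the fields to a JSON string (hypothetical helper)."""
    # json.dumps writes the braces and escapes special characters itself.
    return json.dumps({"building": building, "area": area})

# A value containing double quotes would break a hand-built template,
# but json.dumps escapes it correctly.
body = build_body('Tower "A"', "Dubai Marina")
print(body)
```

The same pattern extends to the full set of fields from the question by adding more keys to the dict.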

Diane M
  • Those are important to create a JSON file, otherwise, it wouldn't be a JSON formatted string. Don't you think? – Marco Dinatsoli Jan 03 '16 at 23:28
  • __Escaping__ (not removing: replace them with `{{` and `}}`) stops `.format` from interpreting the braces as replacement fields. They are reserved characters. – Diane M Jan 03 '16 at 23:36
  • Do you mean like this? body = '\{"building": "{0}", "area" : "{1}", "subarea" : "{2}", "country" : "{3}", "city" : "{4}", "payment_type" : "{5}", "category" : "{6}", "phoneticbuilding" : "{7}", "phoneticarea" : "{8}", "phoneticssubarea": "{9}" \}'.format(building, area, subarea, country, city, payment_type, category, phoneticbuilding, phoneticarea, phoneticssubarea) – Marco Dinatsoli Jan 03 '16 at 23:58