0

I have a text file filled with place data provided by the Twitter API. Here are two sample lines:

{'country': 'United Kingdom', 'full_name': 'Dorridge, England', 'id': '31fe56e2e7d5792a', 'country_code': 'GB', 'name': 'Dorridge', 'attributes': {}, 'contained_within': [], 'place_type': 'city', 'bounding_box': {'coordinates': [[[-1.7718518, 52.3635912], [-1.7266702, 52.3635912], [-1.7266702, 52.4091167], [-1.7718518, 52.4091167]]], 'type': 'Polygon'}, 'url': 'https://api.twitter.com/1.1/geo/id/31fe56e2e7d5792a.json'}

{'country': 'India', 'full_name': 'New Delhi, India', 'id': '317fcc4b21a604d5', 'country_code': 'IN', 'name': 'New Delhi', 'attributes': {}, 'contained_within': [], 'place_type': 'city', 'bounding_box': {'coordinates': [[[76.84252, 28.397657], [77.347652, 28.397657], [77.347652, 28.879322], [76.84252, 28.879322]]], 'type': 'Polygon'}, 'url': 'https://api.twitter.com/1.1/geo/id/317fcc4b21a604d5.json'}

I want the 'country', 'name' and 'coordinates' fields of each line. To get them I need to iterate over the entire file line by line, so I append each line to a list:

data = []
with open('place.txt','r') as f:
    for line in f:
        data.append(line)

When I check the type of an element, it shows as 'str' instead of 'dict':

type(data[0])
str

data[0].keys()
AttributeError: 'str' object has no attribute 'keys'

How can I fix this so that the data is loaded as a list of dictionaries?

Originally, the tweets were encoded and decoded with the following code:

f.write(jsonpickle.encode(tweet._json, unpicklable=False) + '\n') #encoded and saved to a .txt file
tweets.append(jsonpickle.decode(line)) # decoding

And the place data file was saved with the following code:

fName = "place.txt"
newLine = "\n"
with open(fName, 'a', encoding='utf-8') as f:
    for i in range(len(tweets)):
        f.write('{}'.format(tweets[i]['place']) +'\n')
Khurshid
  • You're reading a string that looks like `{'country':'United Kingdom' ,...}`, etc. However, you want to parse this and turn it into a dictionary. I recommend using a JSON parser to make your job easier. :) – apnorton Oct 12 '16 at 19:08
  • To add to @apnorton's comment, Python ships with a [JSON library](https://docs.python.org/2/library/json.html) – UnholySheep Oct 12 '16 at 19:09
  • The rational solution is to save the file in JSON format, so the data loads easily in the expected format. If that's not possible, you can use `ast.literal_eval` to evaluate the string as a Python object. – Mazdak Oct 12 '16 at 19:11

4 Answers

2

In your case you should use json to parse the data. But if json does not work for you (which should be rare, since the data comes from an API), then in general you can convert a string to a dictionary like this:

>>> import ast
>>> x = "{'country': 'United Kingdom', 'full_name': 'Dorridge, England', 'id': '31fe56e2e7d5792a', 'country_code': 'GB', 'name': 'Dorridge', 'attributes': {}, 'contained_within': [], 'place_type': 'city', 'bounding_box': {'coordinates': [[[-1.7718518, 52.3635912], [-1.7266702, 52.3635912], [-1.7266702, 52.4091167], [-1.7718518, 52.4091167]]], 'type': 'Polygon'}, 'url': 'https://api.twitter.com/1.1/geo/id/31fe56e2e7d5792a.json'}"
>>> d = ast.literal_eval(x)
>>> d

d is now a dictionary instead of a string. But again, if your data is in JSON format, Python has a built-in library to handle it, and it is better and safer to use json than ast.

For example, if you get a response, say resp, you could simply do:

response = json.loads(resp)

and now you can work with response as a dictionary.
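
Applied to the file from the question, the `ast.literal_eval` approach might look like the sketch below. It assumes every non-empty line is a Python dict literal like the two samples; the helper name `parse_place_lines` is made up for illustration, and the sample lines are shortened.

```python
import ast


def parse_place_lines(lines):
    """Turn an iterable of dict-literal strings into a list of dicts."""
    places = []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines between records
            continue
        # literal_eval only accepts Python literals, so it cannot run arbitrary code
        places.append(ast.literal_eval(line))
    return places


# Two shortened sample lines standing in for the contents of place.txt
sample_lines = [
    "{'country': 'United Kingdom', 'name': 'Dorridge', "
    "'bounding_box': {'coordinates': [[[-1.7718518, 52.3635912]]], 'type': 'Polygon'}}\n",
    "\n",
    "{'country': 'India', 'name': 'New Delhi', "
    "'bounding_box': {'coordinates': [[[76.84252, 28.397657]]], 'type': 'Polygon'}}\n",
]

places = parse_place_lines(sample_lines)
for place in places:
    print(place['country'], place['name'], place['bounding_box']['coordinates'])
```

With the real file, the same helper could be called as `parse_place_lines(f)` inside `with open('place.txt', encoding='utf-8') as f:`. If a tweet had no place, the line would read `None`, which `literal_eval` returns as `None`, so such entries may need filtering before indexing.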

coder
  • A second one is better – Dmitry Zagorulkin Oct 12 '16 at 19:17
  • @ZagorulkinDmitry, if you mean json, yes totally agree, it is a lot better in that cases when have to do with an API. – coder Oct 12 '16 at 19:19
  • See [Is using eval in Python a bad practice?](http://stackoverflow.com/questions/1832940/is-using-eval-in-python-a-bad-practice) – C8H10N4O2 Oct 12 '16 at 19:26
  • @C8H10N4O2, yes, I know that `eval` is not safe to use; that is why I used `literal_eval` instead, which is a lot safer. I also mentioned in my answer and in the comment above that json should be used wherever it can be. Anyway, since the response comes from Twitter, I don't think code injection is a concern here :) But again, yes, I agree that in general `eval` is dangerous! – coder Oct 12 '16 at 19:33
  • json.loads() gives error "SyntaxError: unexpected EOF while parsing " – Khurshid Oct 13 '16 at 17:48
  • Sorry for the above 2 comments; they were posted by mistake after I accidentally hit Enter. The line d = ast.literal_eval(x) gives the error "SyntaxError: unexpected EOF while parsing" and json.loads() gives the error "JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)" – Khurshid Oct 13 '16 at 17:58
  • @KhurshidKhan, hm, first of all make sure that your data is in valid JSON format, as explained in detail in the other answers (about double and single quotes). If there is still an error, it could have many causes. The first thing I would check is whether the file contains any special characters, and if it does, specify the encoding. – coder Oct 13 '16 at 17:58
  • @KhurshidKhan, ok, just saw the last comment. So problem resolved? – coder Oct 13 '16 at 17:59
  • It turns out my data was not valid JSON; converting single quotes to double quotes and then calling json.loads() works. Thanks for your help – Khurshid Oct 14 '16 at 17:28
1

Note: Single quotes are not valid JSON.

I have never tried the Twitter API, but it looks like your data is not valid JSON. Here is a simple preprocessing step that replaces ' (single quote) with " (double quote):

import json

data = "{'country': 'United Kingdom', ... }"

# swap every single quote for a double quote so the text becomes valid JSON
json_data = data.replace("'", '"')
dict_data = json.loads(json_data)
dict_data.keys()
# [u'full_name', u'url', u'country', ... ]
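
One caveat with a blanket quote replacement: it only works while no value contains an apostrophe or a Python-only literal such as `None`. Below is a small sketch of that failure with `ast.literal_eval` as a fallback. The record is made up for illustration and does not come from the question's file, and `parse_record` is a hypothetical helper name.

```python
import ast
import json

# Python's repr switches to double quotes when a string contains an apostrophe,
# and writes None where JSON would need null
record = repr({'country': "Cote d'Ivoire", 'attributes': None})


def parse_record(text):
    """Try the quote-replacement trick first, fall back to literal_eval."""
    try:
        return json.loads(text.replace("'", '"'))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return ast.literal_eval(text)


parsed = parse_record(record)
print(parsed['country'])
```

For lines like the two samples in the question, which contain neither apostrophes nor `None`, the plain replacement is enough.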
Kir Chou
  • There are no quotes, single or double, at the start or end of each line, so I think we need to add double quotes at the start and end of the line instead of replacing them. How can we achieve this? – Khurshid Oct 14 '16 at 16:56
  • Thank you very much, converting single quotes to double quotes works. I am a novice programmer and misread your answer at first; really sorry for that – Khurshid Oct 14 '16 at 17:26
1

You should use Python's json library to parse the data and get the values. In Python it's quite easy:

import json
x = '{"country": "United Kingdom", "full_name": "Dorridge, England", "id": "31fe56e2e7d5792a", "country_code": "GB", "name": "Dorridge", "attributes": {}, "contained_within": [], "place_type": "city", "bounding_box": {"coordinates": [[[-1.7718518, 52.3635912], [-1.7266702, 52.3635912], [-1.7266702, 52.4091167], [-1.7718518, 52.4091167]]], "type": "Polygon"}, "url": "https://api.twitter.com/1.1/geo/id/31fe56e2e7d5792a.json"}'
y = json.loads(x)
print(y["country"],y["name"],y["bounding_box"]["coordinates"])
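
If you control the code that writes `place.txt`, the parsing problem goes away when each place is written with `json.dumps` instead of `'{}'.format(...)`. Here is a minimal sketch of that round trip, assuming `tweets` is a list of decoded tweet dicts as in the question. The sample tweets are shortened, and an in-memory buffer stands in for the real file.

```python
import io
import json

# Shortened stand-ins for the decoded tweets from the question
tweets = [
    {'place': {'country': 'United Kingdom', 'name': 'Dorridge',
               'bounding_box': {'coordinates': [[[-1.7718518, 52.3635912]]]}}},
    {'place': {'country': 'India', 'name': 'New Delhi',
               'bounding_box': {'coordinates': [[[76.84252, 28.397657]]]}}},
]

# io.StringIO stands in for open('place.txt', 'w', encoding='utf-8')
buffer = io.StringIO()
for tweet in tweets:
    buffer.write(json.dumps(tweet['place']) + '\n')  # one JSON document per line

# Read the lines back; each one is now valid JSON
buffer.seek(0)
places = [json.loads(line) for line in buffer if line.strip()]

rows = [(p['country'], p['name'], p['bounding_box']['coordinates']) for p in places]
print(rows)
```

Writing one JSON document per line keeps the line-by-line reading loop from the question unchanged; only `json.loads(line)` is added.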
Aman Jaiswal
0

You can collect the keys into a list like this (assuming ndata is one of the parsed dictionaries):

mlist= list() 
for i in ndata.keys(): 
    mlist.append(i)
Kate