I have a DataFrame of tweets and want to select just two columns, 'id' and 'text'. However, selecting the 'id' column keeps failing. This works:
import pandas as pd
tweets = pd.read_csv('alltweets.csv')
specified_tweets = tweets[['text']]
But this gives an error:
specified_tweets = tweets[['id','text']]
KeyError: "['id'] not in index"
But 'id' definitely appears in the list of columns:
tweets.columns
Index(['id', 'time', 'created_at', 'from_user_name', 'text', 'filter_level',
       'possibly_sensitive', 'withheld_copyright', 'withheld_scope',
       'truncated', 'retweet_count', 'favorite_count', 'lang', 'to_user_name',
       'in_reply_to_status_id', 'quoted_status_id', 'source', 'location',
       'lat', 'lng', 'from_user_id', 'from_user_realname',
       'from_user_verified', 'from_user_description', 'from_user_url',
       'from_user_profile_image_url', 'from_user_utcoffset',
       'from_user_timezone', 'from_user_lang', 'from_user_tweetcount',
       'from_user_followercount', 'from_user_friendcount',
       'from_user_favourites_count', 'from_user_listed',
       'from_user_withheld_scope', 'from_user_created_at'],
      dtype='object')
EDIT: This is what the data looks like:
{'created_at': {0: '2018-02-13 13:14:08', 2: '2018-02-13 13:14:23'},
'favorite_count': {0: 0, 2: 0},
'filter_level': {0: 'low', 2: 'low'},
'from_user_created_at': {0: '2011-07-28 13:56:37', 2: '2017-10-14 13:21:03'},
'from_user_description': {0: "Feyenoord..... en me lieverd natuurlijk! Anti islam, lasciate ogne speranza voi ch'entrate", 2: "The world has 2 mayor problems: SSocialism and isLam. Without those ideologies we didn't have wars, mass migration and terrorism. http://Gab.ai/Diver"},
'from_user_favourites_count': {0: 3630, 2: 0},
'from_user_followercount': {0: 594, 2: 479},
'from_user_friendcount': {0: 592, 2: 524},
'from_user_id': {0: 344062208, 2: 919191322162024448},
'from_user_lang': {0: 'nl', 2: 'nl'},
'from_user_listed': {0: 129, 2: 1},
'from_user_name': {0: 'Ratatouile1', 2: 'DuikerT3'},
'from_user_realname': {0: 'Rat', 2: 'Gab.ai/Diver'},
'from_user_timezone': {0: 'Amsterdam', 2: 'Hanoi'},
'from_user_tweetcount': {0: 14077, 2: 17775},
'from_user_url': {0: nan, 2: 'url'},
'from_user_utcoffset': {0: 3600.0, 2: 25200.0},
'from_user_verified': {0: 0, 2: 0},
'from_user_withheld_scope': {0: nan, 2: nan},
'in_reply_to_status_id': {0: nan, 2: nan},
'lang': {0: 'nl', 2: 'nl'},
'lat': {0: nan, 2: nan},
'lng': {0: nan, 2: nan},
'location': {0: 'Schiedam', 2: 'Thailand'},
'possibly_sensitive': {0: nan, 2: nan},
'quoted_status_id': {0: nan, 2: nan},
'retweet_count': {0: 0, 2: 0},
'source': {0: '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>', 2: '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>'},
'text': {0: 'RT @Derksen_Gelul: Dus #Zijlstra loog en toen hij betrapt werd kwam hij weer met een nieuwe leugen, dit kan zelfs #Rutte niet meer rec… ', 2: 'RT @MikevdGalienNL: Nee, #Rutte, de inhoud van het verhaal van #Zijlstra staat niet. Het is 100% duidelijk dat hij de boel helemaal bij… '},
'time': {0: 1518527648, 2: 1518527663},
'to_user_name': {0: nan, 2: nan},
'truncated': {0: nan, 2: nan},
'withheld_copyright': {0: nan, 2: nan},
'withheld_scope': {0: nan, 2: nan},
'\ufeffid': {0: 963400901305217024, 2: 963400963934642178}}
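One thing that stands out in the dict above is the last key, '\ufeffid': the column name seems to begin with a byte order mark (U+FEFF), which is invisible in the tweets.columns display and would explain why 'id' is not found. Below is a minimal sketch of checking for it and working around it, assuming the file is UTF-8 with a BOM. The two-row frame and the in-memory CSV are hypothetical stand-ins for alltweets.csv, not the real data.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real frame: the first column name carries
# a leading byte order mark (U+FEFF), as in the dict shown above.
tweets = pd.DataFrame({
    '\ufeffid': [963400901305217024, 963400963934642178],
    'text': ['first tweet', 'second tweet'],
})

# repr() of each name shows escape sequences for hidden characters.
print([repr(c) for c in tweets.columns])

# Option 1: strip the BOM from the column names after loading.
cleaned = tweets.rename(columns=lambda c: c.lstrip('\ufeff'))
specified_tweets = cleaned[['id', 'text']]

# Option 2: let the decoder drop the BOM while reading the file.
# The bytes below are a made-up CSV that starts with a UTF-8 BOM.
raw = 'id,text\n1,hello\n'.encode('utf-8-sig')
reread = pd.read_csv(io.BytesIO(raw), encoding='utf-8-sig')
```

For the real file, the second option would presumably look like pd.read_csv('alltweets.csv', encoding='utf-8-sig'), provided the file really is UTF-8 encoded.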