This is a follow-up to my previous question on keyword extraction from a string in R: Extract a specific key word from a string in R

I have written the following code, which returns the keyword I want:

loc <- t1$place
loc <- gsub('"', '', loc)
loc <- gsub(',', '', loc)
for(i in 1:nrow(t1)) 
  country <- word(loc[i], 19, sep=fixed(" : "))
country <- gsub(' }', '', country)

The for loop does not seem to work correctly. When I run the same code from inside the for loop with a hardcoded index, as shown below:

country <- word(loc[2], 19, sep=fixed(" : "))
country <- gsub(' }', '', country)

The code seems to work. But when I run it in the loop, I get an error:

Error in word[loc, "start"] : subscript out of bounds

Please help me find where it is going wrong.

class(country) 

says it is a character type. Is the way I coded the for loop wrong?

Other details: t1 is the data frame holding my table. I used Import Dataset to load my file week_tweet_filtered.csv and then used the command:

t1 <- week_tweet_filtered

to copy it into the variable t1. I access the place column of my table using t1$place. The place column contains values of the format:

{ "id" : "94965b2c45386f87", "name" : "New York", "boundingBoxCoordinates" : [ [ { "longitude" : -79.76259, "latitude" : 40.477383 }, { "longitude" : -79.76259, "latitude" : 45.015851 }, { "longitude" : -71.777492, "latitude" : 45.015851 }, { "longitude" : -71.777492, "latitude" : 40.477383 } ] ], "countryCode" : "US", "fullName" : "New York, USA", "boundingBoxType" : "Polygon", "URL" : "https://api.twitter.com/1.1/geo/id/94965b2c45386f87.json", "accessLevel" : 0, "placeType" : "admin", "country" : "United States" }
kpks
  • Why are you not just dealing with the JSON? See this question: http://stackoverflow.com/questions/2061897/parse-json-with-r – Elin May 16 '15 at 01:15
  • I have tried doing that, but it throws an error saying there should be single quotes delimiting the string, which my table does not have. `loc <- fromJSON(t1$place) Error in fromJSON(t1$place) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'` I have also tried `loc <- fromJSON(as.character(t1$place))`, as someone suggested previously, but that reads only the first row. Using a for loop gives the same error. I believe there is something wrong with the for loop alone. – kpks May 16 '15 at 01:21

1 Answer

This worked for me:

x<-'{ "id" : "94965b2c45386f87", "name" : "New York", "boundingBoxCoordinates" : [ [ { "longitude" : -79.76259, "latitude" : 40.477383 }, { "longitude" : -79.76259, "latitude" : 45.015851 }, { "longitude" : -71.777492, "latitude" : 45.015851 }, { "longitude" : -71.777492, "latitude" : 40.477383 } ] ], "countryCode" : "US", "fullName" : "New York, USA", "boundingBoxType" : "Polygon", "URL" : "https://api.twitter.com/1.1/geo/id/94965b2c45386f87.json", "accessLevel" : 0, "placeType" : "admin", "country" : "United States" }'
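library(jsonlite)  # assumption: the original does not say which package provides fromJSON()
y <- fromJSON(x)   # parse the single JSON string into a named list
y[['country']]     # pull out the "country" field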

Notice that the first line encloses the JSON in single quotes. I don't know if that is the problem you are having.

If you don't have the quotes try

x <- as.character(t1$place)

I don't really understand how you are getting that not as a string.
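
The 'integer' in the error message from the comments suggests the column was probably read in as a factor, which R stores as integer codes. Also note that the loop in the question assigns to the same `country` variable on every pass, so only the last value would survive. Here is a minimal sketch of converting the column and parsing one JSON string per row. It assumes the jsonlite package, and the two-row `place` vector is a made-up stand-in for t1$place:

```r
library(jsonlite)  # assumption: any package that provides fromJSON() should behave similarly

# Hypothetical stand-in for t1$place, stored as a factor
place <- factor(c(
  '{ "name" : "New York", "country" : "United States" }',
  '{ "name" : "Paris", "country" : "France" }'
))

# Convert the factor to character, then parse each row separately
# and keep only the "country" field
country <- vapply(
  as.character(place),
  function(s) fromJSON(s)[["country"]],
  character(1),
  USE.NAMES = FALSE
)
country
```

vapply() returns one value per row, so nothing is overwritten, and it stops with an error if any row does not yield a single string.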

Elin
  • This works for me too, but only when I write each one out manually and add the single quotes myself. I can't do that for the entire table, which has about 2 million records, so I wanted to use the for loop. – kpks May 16 '15 at 01:44
  • Why don't you just concatenate the single quotes? – Elin May 16 '15 at 01:44
  • Forgive my ignorance, but can you please tell me how to do that? I am using R for the first time and am not aware of a lot of things. – kpks May 16 '15 at 01:47
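
As a rough illustration of the concatenation being asked about: paste0() is the base R function for gluing strings together. The sketch below uses a shortened, made-up value `s` in place of one row of t1$place. It also shows that the single quotes in the answer's code only delimit the R string literal and are not part of the value, so adding literal quotes is probably not what is needed:

```r
# Hypothetical one-row stand-in for as.character(t1$place)
s <- '{ "countryCode" : "US", "country" : "United States" }'

# The single quotes above only mark where the R literal starts and ends;
# the stored value begins with the opening brace
first_char <- substr(s, 1, 1)

# paste0() concatenates strings; here it adds literal quote characters,
# so the result now starts with a quote and is no longer plain JSON
quoted <- paste0("'", s, "'")
first_quoted <- substr(quoted, 1, 1)
```

paste0() is vectorized, so the same call would work on a whole character column without a loop.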