
I'm trying to use the Indeed API to search for specific jobs, and I've run into a problem: my for loop doesn't go through every iteration. Here is an example of the code I used:

original_url_1 <- "http://api.indeed.com/ads/apisearch?publisher=750330686195873&format=json&q="
original_url_2 <-"&l=Canada&sort=date&radius=10&st=&jt=&start=0&limit=25&fromage=3&filter=&latlong=1&co=ca&chnl=&userip=69.46.99.196&useragent=Mozilla/%2F4.0%28Firefox%29&v=2" 
keywords <- c("data+scientist", "data+analyst")

for (i in keywords) {
    url <- paste0(original_url_1, i, original_url_2)
    x <- as.data.frame(jsonlite::fromJSON(httr::content(httr::GET(url),
                                    as = "text", encoding = "UTF-8")))
    data <- rbind(data, x)
}

The URL returns a JSON document, and changing the keyword in the URL changes the JSON that comes back. I'm trying to repeat the request for every keyword and store the results in a single data frame. However, when I use more keywords, I only get results for the first few.

amonk
Roman

2 Answers

library(data.table)  # provides data.table()

original_url_1 <- "http://api.indeed.com/ads/apisearch?publisher=750330686195873&format=json&q="
original_url_2 <-"&l=Canada&sort=date&radius=10&st=&jt=&start=0&limit=25&fromage=3&filter=&latlong=1&co=ca&chnl=&userip=69.46.99.196&useragent=Mozilla/%2F4.0%28Firefox%29&v=2" 
keywords <- c("data_scientist", "data+analyst")

data <- data.table(NULL)  # initialize the object that collects the results

for (i in keywords) {
  # R is case-sensitive: the variables are original_url_1 / original_url_2
  url <- paste0(original_url_1, i, original_url_2)
  x <- as.data.frame(jsonlite::fromJSON(httr::content(httr::GET(url), as = "text", encoding = "UTF-8")))
  data <- rbind(data, x)
}

> dim(data)
[1] 39 31
amonk
  • Thanks again, I really appreciate your help! I did initialize the object in my code; I just forgot to include it in the question. Since I am getting some results, I think the problem is with the loop itself: it doesn't iterate over all the keywords and just stops at some point. Maybe when a keyword returns no results, an error is thrown and the loop stops. I probably just need to add an IF statement and try again. What do you think? – Roman Jun 09 '17 at 15:52
  • Or it might be that the API does not allow that many requests. Try `Sys.sleep()` to pause the loop after every N requests. Concerning your comment above, see https://stackoverflow.com/questions/23139357/how-to-determine-if-a-url-object-in-r-base-package-returns-404-not-found – amonk Jun 09 '17 at 16:03
  • I just reran the code using the tryCatch() function and it works! So the problem was that the loop stops when an error occurs. Thanks for all your help! – Roman Jun 09 '17 at 19:59
  • You might want to edit the answer accordingly, incorporating the changes you made... – amonk Jun 10 '17 at 13:05
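
For anyone following the comments above, here is a rough sketch of the two suggestions combined: pausing between requests and checking the HTTP status before parsing. The helper name `fetch_json` and the one-second default pause are my own assumptions, not something the Indeed API documents; only `httr::GET`, `httr::http_error`, `httr::status_code`, `httr::content` and `jsonlite::fromJSON` are real library calls.

```r
# Hypothetical helper (name and default pause are assumptions):
# fetch one URL and return the parsed JSON, or NULL when the server
# answers with an HTTP error status.
fetch_json <- function(url, pause = 1) {
  Sys.sleep(pause)                        # wait before each request
  resp <- httr::GET(url)
  if (httr::http_error(resp)) {           # TRUE for status codes >= 400
    message("Request failed with status ", httr::status_code(resp), ": ", url)
    return(NULL)
  }
  jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
}
```

Inside the loop you would call `fetch_json(url)` and skip the `rbind` when it returns `NULL`. Whether a pause is needed at all depends on the API's rate limits, which I have not checked.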

Here is the correct code:

original_url_1 <- "http://api.indeed.com/ads/apisearch?publisher=750330686195873&format=json&q="
original_url_2 <-"&l=Canada&sort=date&radius=10&st=&jt=&start=0&limit=25&fromage=3&filter=&latlong=1&co=ca&chnl=&userip=69.46.99.196&useragent=Mozilla/%2F4.0%28Firefox%29&v=2" 
keywords <- c("data+scientist", "data+analyst")

data <- data.frame()

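for (i in keywords) {
  tryCatch({
    url <- paste0(original_url_1, i, original_url_2)
    x <- as.data.frame(jsonlite::fromJSON(httr::content(httr::GET(url),
                                                        as = "text", encoding = "UTF-8")))
    data <- rbind(data, x)
  }, error = function(e) {
    # Skip the keyword that failed, but report it instead of hiding the error
    message("Skipping keyword '", i, "': ", conditionMessage(e))
  })
}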
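
A possible variation on the loop above, as a sketch only: collect each result in a list, record which keywords failed, and call `rbind` once at the end. The names `collect_results` and `fake_fetch` are made up for illustration, and `fake_fetch` is a stand-in for the real request so the example runs without network access.

```r
# Hypothetical helper: apply `fetch` to every keyword, keep going on errors,
# and combine the successful results with a single rbind at the end.
collect_results <- function(keywords, fetch) {
  results <- list()
  failed <- character(0)
  for (kw in keywords) {
    out <- tryCatch(fetch(kw), error = function(e) {
      message("Skipping '", kw, "': ", conditionMessage(e))
      NULL
    })
    if (is.null(out)) {
      failed <- c(failed, kw)      # remember which keywords did not work
    } else {
      results[[kw]] <- out
    }
  }
  list(data = do.call(rbind, results), failed = failed)
}

# Stand-in for the real request (an assumption, so the example runs offline):
# it fails for one keyword to imitate a search that throws an error.
fake_fetch <- function(kw) {
  if (kw == "no+results") stop("empty response")
  data.frame(keyword = kw, hits = nchar(kw))
}

res <- collect_results(c("data+scientist", "no+results", "data+analyst"), fake_fetch)
print(res$data)
print(res$failed)
```

With the real API, `fetch` would be a function that builds the URL with `paste0(original_url_1, kw, original_url_2)` and parses the response as in the loop above. Keeping the failed keywords around makes it easier to see afterwards why some results are missing.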
Roman