
I've hit an error with the GTrendsR package that other examples on StackOverflow don't cover: how to loop through several searches using for or lapply.

A simple search on one keyword works, e.g. gtrends(ch, query = "Harvard University" , geo = "US")

But looping over many keywords gives an error that doesn't occur with a single search:

Error in charToDate(x) : character string is not in a standard unambiguous format

from lapply(queries, function(x) gtrends(ch, query = x , geo = "US"))

and

for (i in seq_along(queries)) {
      x <- queries[i]
      dta[i,] <-  gtrends(ch, query = x , geo = "US")$trend   # trend data.frame returned from gtrends()
}

In case background and code are needed: I'm trying to get Google Trends search history for the US college names listed in IPEDS (at this US Dept. of Education API link).

I'm using the GTrendsR package, installed with

devtools::install_bitbucket(repo = "gtrendsr", username="persican")

Single search terms work fine, but as soon as I try to automate, I get the GTrendsR error.

library("GTrendsR", lib.loc="~/Library/R/3.1/library")

download.file("https://inventory.data.gov/dataset/032e19b4-5a90-41dc-83ff-6e4cd234f565/resource/38625c3d-5388-4c16-a30f-d105432553a4/download/postscndryunivsrvy2013dirinfo.csv" , destfile="ipeds.csv", method="curl")

colleges <- read.csv("./ipeds.csv", header=T, stringsAsFactors=F)

queries <- colleges$INSTNM  # Institution Names

Prepopulate a data frame with 3 columns to hold the output of gtrends():

dta <-data.frame(matrix(NA, length(queries),3)) 

Set credentials:

usr <- "your@gmail.com"
psw <- "yourpassword"
ch <- gconnect(usr, psw)

For loop to automate:

for (i in seq_along(queries)) {
      x <- queries[i]
      dta[i,] <-  gtrends(ch, query = x , geo = "US")$trend   # trend data.frame returned from gtrends()
}

lapply doesn't work either:

lapply(queries, function(x) gtrends(ch, query = x , geo = "US")$trend)

I get this error:

Error in charToDate(x) : character string is not in a standard unambiguous format

The error seems to come from a dependency on charToDate(), which I can't figure out how to get at.

However, when I use just 3 searches it works:

three <- list("Harvard University", "Boston College", "Bard College")

out <- sapply(three, function(x)  cbind.data.frame(gtrends(ch, query = x , geo = "US")$trend[3])[])
data_steve
  • That problem is related to `as.Date`, not really to the package. `charToDate` is defined in `as.Date.character`. What is the vector/data that you are passing to `gtrends`? – Rich Scriven Nov 20 '14 at 00:28
  • @RichardScriven, my code above specifies what I'm passing: a vector of US college names as search terms. Based on reading the gtrends documentation, I didn't see a date option or argument. – data_steve Nov 20 '14 at 01:11
  • @SO folks I'm curious why this got downvotes? It has a working example, clearly states the problem, and provides an error message. Downvoting without feedback is not helpful, particularly to a new poster. – Tyler Rinker Nov 20 '14 at 15:39
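The `as.Date` point raised in the comments can be reproduced without the package. The following is a minimal sketch in base R only; it does not show what gtrends() does internally, only that `as.Date()` raises this exact message when it is handed a string that is not a date and no `format` is supplied.

```r
# A string in a standard format parses without trouble
ok <- as.Date("2014-11-20")

# A non-date string with no format argument triggers the charToDate() error;
# tryCatch() captures the message instead of stopping the script
msg <- tryCatch(
  {
    as.Date("Harvard University")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With an explicit format, as.Date() returns NA instead of raising an error
with_format <- as.Date("Harvard University", format = "%Y-%m-%d")
print(is.na(with_format))
```

So the message suggests that something other than a date string reached `as.Date()`, which would fit a response that did not contain the expected trend table.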

1 Answer


This happens because URLs can't contain raw spaces, and your search phrases have spaces in them. The error message isn't much help here, but @Richard's question got me thinking along the right track.

So you are passing terms with spaces, but Google wants + rather than spaces. gsub to the rescue.

x <- "one two three"
gsub("\\s+", "\\+", x)

## [1] "one+two+three"
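As a side-by-side illustration, here is the same idea applied to a name with a comma, next to base R's `utils::URLencode()`. This is only a sketch: the institution name is made up, and whether gtrends() accepts a percent-encoded query is an untested assumption.

```r
# Made-up institution name with a comma, for illustration only
x <- "Example College, Main Campus"

# Approach from this answer: hyphens/commas become spaces, then runs of
# whitespace become a single "+"
plus_form <- gsub("\\s+", "+", gsub("[-,]", " ", x))

# Alternative (assumption: untested with gtrends): percent-encode every
# character that is not URL-safe, including reserved ones such as ","
encoded <- utils::URLencode(x, reserved = TRUE)

print(plus_form)  # "Example+College+Main+Campus"
print(encoded)    # spaces appear as %20
```

The `+` form drops the comma entirely, while the percent-encoded form keeps it, so the two may not return the same search results.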

Now applied to the problem. I also wrapped the call in try() to deal with any errors you may get. This returns a list of data frames, with NULL for any query that failed.

colleges <- read.csv("ipeds.csv", header=TRUE, stringsAsFactors=FALSE)
queries <- colleges[["INSTNM"]]
dta <- data.frame(matrix(NA, length(queries),3)) 

usr <- "email@gmail.com"
psw <- "password"
ch <- gconnect(usr, psw)

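output <- lapply(queries, function(x) {
    # replace hyphens and commas with spaces, then runs of whitespace with "+"
    x <- gsub("\\s+", "\\+", gsub("[-,]", " ", x))
    # try() keeps the loop going when a single query fails
    out <- try(gtrends(ch, query = x , geo = "US")[["trend"]])
    # a failed query yields NULL instead of a data frame
    if (inherits(out, "try-error")) return(NULL)
    out
})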
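Since `output` is a list that mixes data frames and NULLs, one possible follow-up is to drop the NULLs and stack the rest into a single data frame. The sketch below uses a small mock list so it runs offline; the column names `start` and `hits` and the query labels are hypothetical, not the actual columns returned by gtrends().

```r
# Mock stand-in for the list returned by lapply(); the columns are hypothetical
output <- list(
  data.frame(start = as.Date("2014-01-05"), hits = 10L),
  NULL,  # a query that failed
  data.frame(start = as.Date("2014-01-12"), hits = 12L)
)
names(output) <- c("A", "B", "C")  # in practice: names(output) <- queries

# Drop the failed (NULL) entries
kept <- Filter(Negate(is.null), output)

# Tag each data frame with its query name, then stack them row-wise
combined <- do.call(
  rbind,
  Map(function(df, nm) cbind(query = nm, df), kept, names(kept))
)
print(combined)
```

Naming the list by query first is what makes it possible to tell afterwards which rows belong to which search term.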
Tyler Rinker
  • I see your `gsub` idea, but how does that solution jibe with the fact that I can get `gtrends` to work with the code chunks below, which also have spaces in them: `gtrends(ch, query = "Harvard University" , geo = "US")` OR `three <- list("Harvard University", "Boston College", "Bard College")` `out <- sapply(three, function(x) cbind.data.frame(gtrends(ch, query = x , geo = "US")$trend[3])[])` – data_steve Nov 20 '14 at 15:51
  • After about 30 minutes or so, I get several errors: `Error : Not enough search volume. Please change your search terms.` It may be because some of the college names are so uncommon that no one searches for them regularly. The `try()` keeps it chugging. I'll try to reduce the sample of colleges to 4-year publics to remove this. But then I start getting the old errors again: `Error in charToDate(x) : character string is not in a standard unambiguous format` But it's still chugging. Could it be because some college names have commas in them, which screws with the function syntax? – data_steve Nov 20 '14 at 16:12
  • @simpletowne Yeah I'm not sure why `gtrends(ch, query = "Harvard University" , geo = "US")` works. At the end you can find the names that threw an error and figure out what they have in common that's throwing an error. This post explains valid url characters so yeah the comma may be causing the issue: http://stackoverflow.com/a/1547940/1000343 – Tyler Rinker Nov 20 '14 at 20:26
  • Added some to catch some hyphens and commas that were being pesky – Tyler Rinker Nov 20 '14 at 20:45
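The suggestion in the comments to find the names that threw an error can be sketched as follows. This is a hedged illustration, not the package's behavior: `fetch_trend()` is a hypothetical stand-in for `gtrends(ch, query = x, geo = "US")[["trend"]]` that fails on names containing a comma, and the third institution name is made up, so the example runs without credentials.

```r
# Hypothetical stand-in for the real gtrends() call; it fails on commas
# purely so that the example has something to catch
fetch_trend <- function(query) {
  if (grepl(",", query, fixed = TRUE)) {
    stop("character string is not in a standard unambiguous format")
  }
  data.frame(query = query, hits = 1L)
}

queries <- c("Harvard University", "Bard College", "Example College, Main Campus")

# Keep the error object instead of discarding it, so the message survives
results <- lapply(queries, function(q) {
  tryCatch(fetch_trend(q), error = function(e) e)
})
names(results) <- queries

# Split successes from failures
failed <- vapply(results, inherits, logical(1), what = "error")
failed_names <- queries[failed]
failed_messages <- vapply(results[failed], conditionMessage, character(1))
trends <- results[!failed]

print(failed_names)
print(failed_messages)
```

Keeping the messages next to the names makes it easier to separate the "Not enough search volume" failures from the `charToDate` ones and to see what the failing names have in common.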