I'm trying to open a URL that contains Japanese characters. This question builds on my first question from yesterday. My code now generates the right URL, and if I paste that URL into my browser I get the right result, but if I try to automate the process with browseURL() I get the wrong result.

E.g. I am trying to call following URL:

http://www.google.com/trends/trendsReport?hl=en-US&q=VWゴルフ %2B VWポロ %2B VWパサート %2B VWティグアン&date=1%2F2010 68m&cmpt=q&content=1&export=1

if I now use

browseURL("http://www.google.com/trends/trendsReport?hl=en-US&q=VWゴルフ %2B VWポロ %2B VWパサート %2B VWティグアン&date=1%2F2010 68m&cmpt=q&content=1&export=1")

I can see in the browser that it browsed

www.google.com/trends/trendsReport?hl=en-US&q=VW%E3%83%BB%EF%BD%BDS%E3%83%BB%EF%BD%BD%E3%83%BB%EF%BD%BD%E3%83%BB%EF%BD%BDt%20%2B%20VW%E3%83%BB%EF%BD%BD%7C%E3%83%BB%EF%BD%BD%E3%83%BB%EF%BD%BD%20%2B%20VW%E3%83%BB%EF%BD%BDp%E3%83%BB%EF%BD%BDT%E3%83%BB%EF%BD%BD[%E3%83%BB%EF%BD%BDg%20%2B%20VW%E3%83%BB%EF%BD%BDe%E3%83%BB%EF%BD%BDB%E3%83%BB%EF%BD%BDO%E3%83%BB%EF%BD%BDA%E3%83%BB%EF%BD%BD%E3%83%BB%EF%BD%BD&date=1%2F2010%2068m&cmpt=q&content=1&export=1

which seems to be an encoding mistake. I already tried

browseURL(URL, encodeIfNeeded=TRUE)

but that doesn't seem to change anything. As far as I understand the function, it shouldn't either, because this argument is there to generate the percent-escapes (such as "%2B"), which makes it even more surprising that I get them even with encodeIfNeeded = FALSE.
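For reference, here is a minimal sketch of what the percent-escapes would look like if the katakana were encoded as UTF-8, compared with one fragment that repeats throughout the URL the browser opened. The variable names are illustrative only, and the \u escapes are used so the snippet does not depend on the console or file encoding.

```r
# "\u30b4" is the katakana GO from the query; the escape keeps it UTF-8 in any locale.
go <- "\u30b4"
expected <- utils::URLencode(go, reserved = TRUE)
print(expected)  # "%E3%82%B4"

# A fragment that repeats in the URL the browser actually opened.
observed <- utils::URLdecode("%E3%83%BB%EF%BD%BD")
Encoding(observed) <- "UTF-8"        # the decoded bytes are UTF-8
codepoints <- utf8ToInt(observed)
print(sprintf("U+%04X", codepoints)) # "U+30FB" "U+FF7D"
```

U+30FB is a katakana middle dot and U+FF7D is a half-width katakana SU. Neither appears in the query, which suggests the string was converted between encodings somewhere before it was escaped.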

Any help is highly appreciated!

> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 8 (build 9200)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=Japanese_Japan.932           LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.2.1
Sket
  • Thank you for your contribution. The '%2B' is there to join several search queries, so if I search for only one query I don't need any escaped characters. I tried this, and unfortunately the encoding is still incorrect: it generates basically the same link as before, just shorter, which seems intuitive. – Sket Sep 01 '15 at 13:52

1 Answer

I think this will get around the issue:

library(httr)
library(curl)

gt_url <- "http://www.google.com/trends/trendsReport?hl=en-US&q=VWゴルフ %2B VWポロ %2B VWパサート %2B VWティグアン&date=1%2F2010 68m&cmpt=q&content=1&export=1"

# ensure the %2B's aren't getting in the way then
# ask httr to carve up the url and put it back together
parts <- parse_url(URLdecode(gt_url))
browseURL(build_url(parts))

That gives this (too long to paste but I want to make sure OP gets to see the whole content).

I also now see why you have to do it this way (both download.file and GET with write_disk don't work because of the JavaScript redirect).
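If httr is not available, a base-R alternative (not the approach above, and untested against Google Trends) might be to percent-encode each search term as UTF-8 yourself and hand browseURL() a pure-ASCII URL. This is only a sketch: encode_query_value is a hypothetical helper, and only two of the four terms are shown.

```r
# Hypothetical helper: percent-encode one query value from its UTF-8 bytes.
encode_query_value <- function(x) {
  utils::URLencode(enc2utf8(x), reserved = TRUE)
}

# "VWゴルフ" and "VWポロ", written with \u escapes so the source stays ASCII.
terms <- c("VW\u30b4\u30eb\u30d5", "VW\u30dd\u30ed")

# Join the encoded terms with an encoded " + " (space, plus sign, space).
q <- paste(vapply(terms, encode_query_value, character(1)),
           collapse = "%20%2B%20")

gt_url_encoded <- paste0(
  "http://www.google.com/trends/trendsReport?hl=en-US&q=", q,
  "&date=1%2F2010%2068m&cmpt=q&content=1&export=1"
)

# The URL is already fully escaped, so no further encoding is requested.
if (interactive()) browseURL(gt_url_encoded, encodeIfNeeded = FALSE)
```

Because every non-ASCII byte is escaped before the URL leaves R, the result should not depend on the session locale, although the browser may still treat the link differently, as the comments below describe.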

hrbrmstr
  • Short question: which browser are you using? I saw a post saying that some browsers encode URLs themselves, or something similar. I don't have a programming background, so I didn't completely understand what they were talking about and can't recall it exactly. I am using Firefox. – Sket Sep 01 '15 at 14:37
  • Yes, I finally got it in the right format. Thank you so much! I switched the browser for the download from Firefox to Chrome and it worked immediately. The characters in the Excel file still look odd and don't appear in the right formatting as in your file, but the values seem to be correct and that's what matters. You helped me a lot! – Sket Sep 01 '15 at 14:53
  • Apologies for the delay in responding. Aye. Chrome on my end as well. I managed to fire up a Windows 10 VM and it didn't work in Edge either (I only have Chrome & Edge on the Win10 VM) though I wasn't signed into google in Edge. Rly glad it worked! – hrbrmstr Sep 01 '15 at 15:24