CasperJS 1.1.0-beta3 with PhantomJS 1.9.8 on OS X 10.10.4 (64-bit).
I'm progressing with my CasperJS and PhantomJS experimentation, but today I hit a mystery. Please see the code that follows:
var casper = require('casper').create();
casper.userAgent('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36');
casper.start('https://www.youtube.com/').viewport(1200, 800);
casper.wait(5000, function(){
    casper.echo(casper.getTitle());
});
casper.run();
As you can see, it's a pretty basic getTitle() call that typically works on any site or domain, but not on YouTube. I double- and triple-checked the syntax and tested various URL formats and a multitude of other video streaming and sharing sites; they all work.
Except YouTube :\
I figured it would help if I accessed the current HTTP response for YouTube:
{
"contentType": null,
"headers": [],
"id": 1,
"redirectURL": null,
"stage": "end",
"status": null,
"statusText": null,
"time": "2015-12-16T04:44:28.984Z",
"url": "https://www.youtube.com/",
"data": null
}
Versus Vimeo:
{
"contentType": "text/html; charset=UTF-8",
"headers": [way too many to paste all of them here],
"id": 2,
"redirectURL": null,
"stage": "end",
"status": 200,
"statusText": "OK",
"time": "2015-12-16T04:50:23.771Z",
"url": "https://vimeo.com/",
"data": null
}
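A dump like the two above can be obtained from a step callback, since CasperJS passes the current HTTP response to it as the first argument. What follows is a minimal sketch under that assumption, not necessarily the exact script used here; summarizeResponse is a hypothetical helper added only to make the null-status case stand out.

```javascript
// Hypothetical helper: condense a CasperJS response object into one line.
// A null status means the main request never produced an HTTP response.
function summarizeResponse(response) {
    if (!response || response.status === null || response.status === undefined) {
        var url = response && response.url ? response.url : '(unknown url)';
        return 'no HTTP response for ' + url;
    }
    return response.status + ' ' + response.statusText + ' for ' + response.url;
}

// Run the CasperJS part only under casperjs, where the `phantom` global exists.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();
    casper.start('https://www.youtube.com/', function(response) {
        // Step callbacks receive the current HTTP response as first argument.
        require('utils').dump(response);
        this.echo(summarizeResponse(response));
    });
    casper.run();
}
```

With the YouTube response shown above, the helper would report that there was no HTTP response at all, which points at the connection rather than at getTitle().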
I also enabled verbose: true.
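For anyone reproducing this, verbose logging is switched on through the options passed to create(). A minimal sketch, assuming debug-level logging (the [debug] lines in the output suggest logLevel was set to 'debug'); the casperOptions variable name is only illustrative.

```javascript
// Documented CasperJS options: verbose prints log messages to the console,
// logLevel selects the minimum level shown ('debug' includes [debug] lines).
var casperOptions = {
    verbose: true,
    logLevel: 'debug'
};

// Create the instance only under casperjs, where the `phantom` global exists.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create(casperOptions);
    casper.start('https://www.youtube.com/');
    casper.wait(5000, function() {
        this.echo(this.getTitle());
    });
    casper.run();
}
```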
This is the output for YouTube:
[info] [phantom] Starting...
[info] [phantom] Running suite: 4 steps
[debug] [phantom] opening url: https://www.youtube.com/, HTTP GET
[debug] [phantom] Navigation requested: url=https://www.youtube.com/, type=Other, willNavigate=true, isMainFrame=true
[warning] [phantom] Loading resource failed with status=fail: https://www.youtube.com/
[debug] [phantom] Successfully injected Casper client-side utilities
[debug] [phantom] start page is loaded
[info] [phantom] Step _step 3/4: done in 185ms.
[info] [phantom] Step _step 4/4: done in 300ms.
[info] [phantom] wait() finished waiting for 5000ms.
[info] [phantom] Done 4 steps in 5310ms
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file:///usr/local/Cellar/casperjs/1.1-beta3/libexec/bin/bootstrap.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file:///usr/local/Cellar/casperjs/1.1-beta3/libexec/bin/bootstrap.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file:///usr/local/Cellar/casperjs/1.1-beta3/libexec/bin/bootstrap.js. Domains, protocols and ports must match.
Versus Vimeo:
[info] [phantom] Starting...
[info] [phantom] Running suite: 4 steps
[debug] [phantom] opening url: http://www.vimeo.com/, HTTP GET
[debug] [phantom] Navigation requested: url=http://www.vimeo.com/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] Navigation requested: url=https://vimeo.com/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "https://vimeo.com/"
[debug] [phantom] Navigation requested: url=https://3797665.fls.doubleclick.net/activityi;src=3797665;type=remar853;cat=Gener-;ord=1452090583?, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Successfully injected Casper client-side utilities
[debug] [phantom] start page is loaded
[info] [phantom] Step _step 3/4 https://vimeo.com/ (HTTP 200)
[info] [phantom] Step _step 3/4: done in 1028ms.
[info] [phantom] Step _step 4/4 https://vimeo.com/ (HTTP 200)
[info] [phantom] Step _step 4/4: done in 1132ms.
[info] [phantom] wait() finished waiting for 5000ms.
Vimeo: Watch, upload and share HD videos with no ads
[info] [phantom] Done 4 steps in 6137ms
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file:///usr/local/Cellar/casperjs/1.1-beta3/libexec/bin/bootstrap.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file:///usr/local/Cellar/casperjs/1.1-beta3/libexec/bin/bootstrap.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file:///usr/local/Cellar/casperjs/1.1-beta3/libexec/bin/bootstrap.js. Domains, protocols and ports must match.
getTitle() was successful with Vimeo.
Since getTitle() returns nothing but a blank string for YouTube, is there an alternative in the CasperJS documentation, is there a specific way to casper.start() YouTube, or is this a new YouTube "anti-scraping" measure? If you need further details, let me know.
THE SOLUTION FOLLOWS BELOW!!!
First, thank you to @Vaviloff and @ArtjomB. for all their help!
The problem was solved with the suggestion by @ArtjomB.:
Try to run it as casperjs --ignore-ssl-errors=true script.js
– Artjom B.
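To make this kind of failure visible instead of silently echoing a blank title, the script itself can check the response and listen for resource errors. This is a minimal sketch, assuming the resource.error event (documented for CasperJS 1.1) is emitted by your build; isFailedLoad and formatResourceError are hypothetical helpers, and the hint text is only a suggestion.

```javascript
// Hypothetical helper: true when the main request produced no HTTP status.
function isFailedLoad(response) {
    return !response || response.status === null || response.status === undefined;
}

// Hypothetical helper: one-line summary of a PhantomJS resource error object
// (fields url, errorCode and errorString).
function formatResourceError(err) {
    return 'resource error ' + err.errorCode + ' (' + err.errorString + ') for ' + err.url;
}

// Run the CasperJS part only under casperjs, where the `phantom` global exists.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();

    // Assumption: this build emits resource.error for failed requests.
    casper.on('resource.error', function(err) {
        this.echo(formatResourceError(err), 'WARNING');
    });

    casper.start('https://www.youtube.com/', function(response) {
        if (isFailedLoad(response)) {
            // Stop early with a hint rather than printing an empty title.
            this.die('Page did not load; try: casperjs --ignore-ssl-errors=true script.js', 1);
        }
        this.echo(this.getTitle());
    });
    casper.run();
}
```

Keep in mind that --ignore-ssl-errors=true turns off certificate checking, so it fits experimentation like this better than anything security-sensitive.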
The HTTP response from YouTube after applying the suggestion follows:
casperjs --ignore-ssl-errors=true quickstart.js
{
"contentType": "text/html; charset=utf-8",
"headers": [way too many to paste them all here],
"id": 2,
"redirectURL": null,
"stage": "end",
"status": 200,
"statusText": "OK",
"time": "2015-12-16T15:25:48.496Z",
"url": "https://www.youtube.com/",
"data": null
}
And the verbose output after applying the suggestion:
[info] [phantom] Starting...
[info] [phantom] Running suite: 4 steps
[debug] [phantom] opening url: https://www.youtube.com/, HTTP GET
[debug] [phantom] Navigation requested: url=https://www.youtube.com/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "https://www.youtube.com/"
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=https://ad.doubleclick.net/N4061/adi/com.ythome/_default;sz=850x250;tile=1;dc_yt=1;kbsg=HPCA151216;kga=-1;kgg=-1;klg=en;kmyd=video-masthead;ytdevice=1;ytexp=9407700,9414823,9414875,9415327,9416485,9421527,9421905,9424442,9425308,9425351,9425784;ord=7805060704704374?, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=https://pubads.g.doubleclick.net/gampad/ads?ad_rule=0&gdfp_req=1&iu=/6762/mkt.ythome_1x1&scp=kbsg=HPCA151216&kga=-1&kgg=-1&klg=en&kmyd=ad_creative_3&ssl=1&ytdevice=1&sz=1x1&correlator=6133483967278153, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=https://www.youtube.com/video_masthead?video_id=Y0tk-WkDvzs&autocrop=1&site_cta=1&textLine1=Nespresso&textLine2=&subscribe_button=0&subscriber_count=0&small_autoplay=0&video_wall=0&list=&autoplay_start_time=0&autoplay_duration=15000&cta_label=Visit our website, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=https://www.youtube.com/embed/Y0tk-WkDvzs?rel=0&mute=1&wmode=opaque&controls=0&showinfo=0&iv_load_policy=3&enablejsapi=1&adformat=1_8&start=0&modestbranding=1&autoplay=1&nologo=1&origin=https://www.youtube.com, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Successfully injected Casper client-side utilities
[debug] [phantom] start page is loaded
[info] [phantom] Step _step 3/4 https://www.youtube.com/ (HTTP 200)
[info] [phantom] Step _step 3/4: done in 3425ms.
[info] [phantom] Step _step 4/4 https://www.youtube.com/ (HTTP 200)
[info] [phantom] Step _step 4/4: done in 3547ms.
[info] [phantom] wait() finished waiting for 5000ms.
YouTube
[info] [phantom] Done 4 steps in 8548ms
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file:///usr/local/Cellar/casperjs/1.1-beta3/libexec/bin/bootstrap.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file:///usr/local/Cellar/casperjs/1.1-beta3/libexec/bin/bootstrap.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file:///usr/local/Cellar/casperjs/1.1-beta3/libexec/bin/bootstrap.js. Domains, protocols and ports must match.
Thank you for taking the time to read :)