Retrieving Soundcloud media URL via PHP CURL?

Question

I'm using the Soundcloud API with PHP to pull information about Soundcloud tracks so that I can use jPlayer as an MP3 player for playlists. I can access all the information about a track except the actual link to the MP3, which I need in order to play it.

According to the API, if the track allows streaming, you just have to cURL the stream URL (e.g. http://api.soundcloud.com/tracks/24887826/stream?client_id=xxx).

Now, if I try to cURL that, the result is just a small piece of HTML: <html><body>You are being <a href="http://ak-media.soundcloud.com/sjkn57c6gAA5.128.mp3?AWSAccessKeyId=AKIAJBHW5FB4ERKUQUOQ&Expires=1318411541&Signature=yaX7Noe%2F8c5dFF0H%2BGfhZ%2FX0130%3D&__gda__=1318411541_6ed9d2af39e51b5f1e94e659eff0495d">redirected</a>.</body></html>.

The link I want is the whole media.soundcloud.com/sjkn57c6gAA5 and that's it. However, if I try to XPath '//a/@href', I get no results. Can anyone point me in the right direction for grabbing this link so I can generate the appropriate link to the file?

Thanks ahead of time!

Tre

score 0 · Accepted Answer · answered Oct 12 '11 at 09:41

If you're using XPath on HTML (not recommended, by the way: HTML is not a subset of XML unless it's XHTML, and plain HTML doesn't require some tags to be closed), shouldn't it be //html/body/a/@href?
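If you do want to stay with XPath, here is a minimal sketch that runs the query against an HTML string using PHP's DOM extension. It assumes the response body is already in a string; the helper name extractFirstHref and the shortened sample URL (query string trimmed from the one in the question) are only for illustration, not something from the Soundcloud API.

```php
<?php
// Hypothetical helper: returns the first <a href="..."> value found in an
// HTML string, or null when there is none.
function extractFirstHref(string $html): ?string
{
    if ($html === '') {
        return null;
    }

    $doc = new DOMDocument();

    // loadHTML() uses libxml's HTML parser, which tolerates markup that is
    // not well-formed XML. Collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//a/@href');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // The matched node is the href attribute; nodeValue holds its text.
    return $nodes->item(0)->nodeValue;
}

// Sample body shortened from the question (assumption: same structure).
$html = '<html><body>You are being <a href="http://ak-media.soundcloud.com/sjkn57c6gAA5.128.mp3?Expires=1318411541">redirected</a>.</body></html>';
$href = extractFirstHref($html);
```

If this returns null on the real response, it may be worth checking that the string passed in is the response body and not, say, a boolean from curl_exec() without CURLOPT_RETURNTRANSFER set.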

Otherwise, you could use a regular expression to extract the URL:

// $html is the response body returned by cURL
if (preg_match('/href="(.*?)"/', $html, $m)) {
    $url = $m[1];
}

Or you could narrow the pattern down to strip the other parts of the URL, but you need to be very careful: I doubt the API guarantees the URL will always have the same format (e.g. that the subdomain will always end with media).
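As a rough sketch of the "strip the other parts" idea without hard-coding the host name, you could let parse_url() split the extracted URL and keep only scheme, host and path. The function name stripQuery and the sample URL (shortened from the question, with a made-up Signature value) are assumptions for illustration. Nothing in this thread confirms that the media URL still works once the signed query string is removed, so treat this as a way to get the bare link rather than a guaranteed playable one.

```php
<?php
// Hypothetical helper: drops the query string and fragment from a URL and
// returns scheme://host/path, or null if the URL cannot be parsed.
function stripQuery(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'], $parts['path'])) {
        return null;
    }

    // Assumption: fall back to http when the URL has no scheme.
    $scheme = $parts['scheme'] ?? 'http';

    return $scheme . '://' . $parts['host'] . $parts['path'];
}

// Sample URL shortened from the question; the Signature value is a placeholder.
$bare = stripQuery('http://ak-media.soundcloud.com/sjkn57c6gAA5.128.mp3?Expires=1318411541&Signature=abc');
```

Because this only relies on generic URL structure, it should keep working if the subdomain or file name format changes.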

What would you recommend for parsing HTML then? I've been using XPath primarily and haven't run into a problem until now. Just regex? — tr3online, Oct 12 '11 at 09:56
