
I am trying to implement the methods discussed in this question to write a PHP function that downloads an audio file for a given string, but I can't seem to get around Google's abuse protection. Results are sporadic: sometimes I get an audio file, and other times I get an unplayable 2 KB mp3 because the response is "Our systems have detected unusual traffic from your computer network". Here is what I have so far (note that $file has a location in my code, but I've omitted it here):

function downloadMP3( $url, $file ){    
    $curl = curl_init();

    curl_setopt( $curl, CURLOPT_URL, $url );
    curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
    curl_setopt( $curl, CURLOPT_REFERER, 'http://translate.google.com/' );
    curl_setopt( $curl, CURLOPT_USERAGENT, 'stagefright/1.2 (Linux;Android 5.0)' );

    $output = curl_exec( $curl );    

    curl_close( $curl );

    if( $output === false ) { 
        return false;
    }

    $fp = fopen( $file, 'wb' );
    fwrite( $fp, $output );
    fclose( $fp );

    return true;
}

$word = "Test";

$file  = md5( $word ) . '.mp3';

if ( !file_exists( $file ) ) {
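    // urlencode() keeps spaces and special characters in $word from breaking the query string
    $url = 'http://translate.google.com/translate_tts?q=' . urlencode( $word ) . '&tl=en&client=t';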
    downloadMP3( $url, $file );
}
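
The failure described above, where the error page is written to disk as an .mp3, can be detected before saving. The following is a minimal sketch, not part of the original code: the helper names `looksLikeMp3` and `saveIfAudio` are hypothetical, and the check only inspects the first bytes of the body (an ID3 tag or an MPEG frame sync), which is a heuristic rather than a full validation.

```php
<?php
// Hypothetical helper: returns true when the body starts like an MP3 file,
// i.e. with an ID3v2 tag or an MPEG audio frame sync (11 set bits).
function looksLikeMp3(string $body): bool
{
    if (strlen($body) < 3) {
        return false;
    }
    if (strncmp($body, 'ID3', 3) === 0) {
        return true;
    }
    return ord($body[0]) === 0xFF && (ord($body[1]) & 0xE0) === 0xE0;
}

// Hypothetical helper: writes the body to $file only if it looks like audio,
// so an HTML error page is never stored under an .mp3 name.
function saveIfAudio(string $body, string $file): bool
{
    if (!looksLikeMp3($body)) {
        return false;
    }
    return file_put_contents($file, $body) !== false;
}

// An HTML error page is rejected; a body starting with a frame sync is accepted.
var_dump(looksLikeMp3('<!DOCTYPE html><html><body>unusual traffic</body></html>'));
var_dump(looksLikeMp3("\xFF\xF3\x40\xC4\x00\x00"));
```

In `downloadMP3`, `saveIfAudio( $output, $file )` could replace the `fopen`/`fwrite`/`fclose` calls. Reading `curl_getinfo( $curl, CURLINFO_HTTP_CODE )` before `curl_close` is another way to spot a rejected request, assuming the block is served with a non-200 status.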
juliusbangert
  • Hi Julius, I'll take a look at this within the next couple of hours. At first glance, this looks like it should work, though you're missing the `ie=UTF-8` in the query string. Try adding that, but I'll be back in a few hours in any case. – Chris Cirefice Dec 07 '15 at 13:32
  • I tried that and it doesn't seem to make a difference. It seems to be sporadic in that it occasionally works and then it stops working. Any ideas? – juliusbangert Dec 07 '15 at 19:32
  • I just tested the `curl` command on OSX at my university, and it works just fine. This makes me think that there's something wrong with your PHP code, or you're having network issues (maybe you're in a place that might spam Google a lot?). Unfortunately I have no knowledge of PHP nor how to run scripts on OSX or Ubuntu, so I can't really help debug your code... do `string`-type variables need to be in ***"*** characters instead of ***'*** ? After a quick Google you might need `CURLOPT_BINARYTRANSFER` as seen [here](http://php.net/manual/en/function.curl-setopt.php). – Chris Cirefice Dec 07 '15 at 21:25
  • In your tests, how are you calling the curl command, if not with PHP? Does it work indefinitely for you, or are you cut off after about five requests? I tried CURLOPT_BINARYTRANSFER, but it has had no effect since PHP 5.1.3. I'm stumped and can only assume this simply isn't going to work. – juliusbangert Dec 08 '15 at 02:32
  • Test out the `curl` command in my previous answer. It will work in any *nix terminal (OSX, linux distros, etc), or install CURL for Windows. It's a one-line command: `curl 'http://translate.google.com/translate_tts?ie=UTF-8&q=Hello&tl=en&client=t' -H 'Referer: http://translate.google.com/' -H 'User-Agent: stagefright/1.2 (Linux;Android 5.0)' > google_tts.mp3`. If the command doesn't work for you, it's definitely a network issue. If it *does* work however, your PHP code needs some work. Probably a missing/misconfigured header. – Chris Cirefice Dec 08 '15 at 02:36
  • In fact, it seems like using `curl` in PHP is actually a *bad* option. PHP has an [`http_get`](http://php.net/manual/en/function.http-get.php) function. That is *definitely* a better solution than using `curl`. You can also set HTTP headers with that function. I would try that! – Chris Cirefice Dec 08 '15 at 02:38
  • Thanks for your efforts Chris. But I tried some local command-line tests in terminal and got the same results, it manages one or two but then fails and just saves the error message into a non-playable mp3. Basically I want to make sure that this would work with no cut off point as I am planning to use it on a website where I can't predict the volume of traffic. When you try it, are you able to consistently fetch different mp3s when you make a series of back to back requests constantly? – juliusbangert Dec 08 '15 at 11:52
  • Julius, yes I have similar code working in a production Android app used by hundreds of thousands of regular users. It seems to me that you have some kind of network issue here. Perhaps your requests are coming from a region that often spams Google services. I can't say for sure... do you have experience using Wireshark? It would be helpful to capture your requests and responses to see what exactly is going on. – Chris Cirefice Dec 08 '15 at 13:36
  • I am having the same issue. Have you managed to find a solution for this? When I try curl from the command line with the `client=t` option, it downloads the mp3 file but the file does not play. If I don't use the `client=t` option, it still downloads the file, but this time the file size is 0. Either way the file is not playable. I am doing this on Windows. I wonder how they do this here: http://soundoftext.com/ – user1448031 Dec 21 '15 at 06:17
  • @ChrisCirefice your curl command isn't working for me either, tested it on 5 servers in 3 different data centers. – user2693017 Dec 26 '15 at 13:47
  • You currently need to send a token along (check out google translate, then speak and see the network request). I'm trying to find a way to fix this. – Rob Dec 30 '15 at 19:13
  • As @RobQuist noted, Google now requires a token (`tk` parameter in the querystring). However, if you check the GET request from using `translate.google.com`, it generates a valid one that you can then use in a cURL command. Please see [my edit to my answer on the other post](http://stackoverflow.com/a/31791632/1986871) which has the cURL working. You can add the `tk` parameter to your PHP code and it should work. Your `$url` should look like this now: `$url = 'http://translate.google.com/translate_tts?q=' . $word . '&tl=en&tk=995126.592330&client=t';` – Chris Cirefice Dec 30 '15 at 21:02
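
The query-string parameters mentioned across these comments (`ie`, `q`, `tl`, `client`, and the later `tk` token) can be assembled with `http_build_query`, which also URL-encodes the text. This is only a sketch: the function name `buildTtsUrl` is hypothetical, and the token value is the one quoted in the last comment, used purely as sample input. Nothing here establishes whether a given token will be accepted for other text or at a later time.

```php
<?php
// Hypothetical helper: builds the translate_tts URL from the parameters
// discussed in the comments. The tk token is optional and supplied by the caller.
function buildTtsUrl(string $text, string $lang = 'en', ?string $token = null): string
{
    $params = [
        'ie'     => 'UTF-8', // input encoding, suggested in the first comment
        'q'      => $text,   // text to speak; encoded by http_build_query
        'tl'     => $lang,   // target language
        'client' => 't',
    ];
    if ($token !== null) {
        $params['tk'] = $token; // token copied from a request made by translate.google.com
    }
    return 'http://translate.google.com/translate_tts?' . http_build_query($params, '', '&');
}

// Sample token taken from the comment above; treat it as a placeholder.
echo buildTtsUrl('Hello world', 'en', '995126.592330'), PHP_EOL;
```

Building the URL this way avoids the unencoded `$word` concatenation in the question's code, so multi-word strings do not produce a malformed request.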

1 Answer


Try another service. I just found one that works even better than Google Translate: the Google Text-To-Speech API.

Rob