I'm trying to use the App Engine bulkloader to download entities from the datastore (the high-replication one, if it matters). It works, but it's quite slow (about 85 KB/s). Is there some magic set of parameters I can pass to make it faster? I'm receiving about 5 MB/minute, or 20,000 records/minute. Given that my connection can do 1 MB/second (and hopefully App Engine can serve faster than that), there must be a way to speed this up.

Here's my current command. I've tried high numbers, low numbers, and every permutation:

appcfg.py download_data \
  --application=xxx \
  --url=http://xxx.appspot.com/_ah/remote_api \
  --filename=backup.csv \
  --rps_limit=30000 \
  --bandwidth_limit=100000000 \
  --batch_size=500 \
  --http_limit=32 \
  --num_threads=30 \
  --config_file=bulkloader.yaml \
  --kind=foo

I already tried the suggestions in "App Engine Bulk Loader Performance", and the result is no faster than what I already have. The numbers he mentions are on par with what I'm seeing as well.

Thanks in advance.

DurhamG

1 Answer

Did you set an index on the key of the entity you're trying to download?
I don't know if it will help, but check whether you get a warning at the beginning of the download that says something about "using sequential download".

Put this in index.yaml to create an index on the entity key, then upload it and wait for the index to be built.

- kind: YOUR_ENTITY_TYPE
  properties:
  - name: __key__
    direction: desc
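
In case it helps to see where that fragment sits: entries in index.yaml live under a top-level `indexes:` key. Here is a small sketch that prints a complete file for one kind. The helper name `key_index_yaml` is made up for illustration, and `foo` is just the kind from the question's `--kind` flag.

```python
def key_index_yaml(kind):
    """Return index.yaml text holding one descending __key__ index for `kind`."""
    lines = [
        "indexes:",                # top-level key that index.yaml entries go under
        "- kind: %s" % kind,
        "  properties:",
        "  - name: __key__",
        "    direction: desc",
    ]
    return "\n".join(lines) + "\n"


# Example: the kind used in the question's download command.
print(key_index_yaml("foo"))
```

If index.yaml already has other entries, only the four lines starting at `- kind:` need to be appended under the existing `indexes:` key.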
Shay Erlichmen
  • That doubled the speed. I hadn't associated that warning with a speed problem. I'm now getting 10MB/minute, which is much better than before, but still well below my connection speed. Thanks for the help! – DurhamG Oct 07 '11 at 23:18
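
For a rough sense of scale, here is a back-of-the-envelope check of the figures quoted in this thread. It is only a sketch: it assumes decimal units (1 MB = 1,000,000 bytes) and uses nothing beyond the rates reported above. The function name is made up for illustration.

```python
def per_second(per_minute):
    """Convert a per-minute rate to a per-second rate."""
    return per_minute / 60.0


MB = 1000 * 1000  # assumption: decimal megabytes
KB = 1000

before = per_second(5 * MB)   # rate reported in the question, bytes/second
after = per_second(10 * MB)   # rate reported after adding the __key__ index
link = 1 * MB                 # asker's stated connection speed, bytes/second

print("before: %.0f KB/s" % (before / KB))
print("after: %.0f KB/s" % (after / KB))
print("bytes per record: %.0f" % (5 * MB / 20000.0))
print("share of link used after: %.0f%%" % (100.0 * after / link))
```

Under those assumptions the records average a few hundred bytes each, and even the doubled rate uses well under a quarter of a 1 MB/s link, which matches the comment that there is still headroom.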