2

I have a Python script that queries a remote MongoDB database. The query results are fairly large (~650MB), but the documents MongoDB sends are extremely compressible because they contain a lot of plain text. Is there some way to insert a compression "proxy" of some sort between my Python script and the MongoDB server, either by modifying the code or by using some sort of network utility?

Thomas Johnson

2 Answers

2

MongoDB 3.4 added support for network compression using Snappy. It's not enabled by default, though (use networkMessageCompressors=snappy on the server to enable it).

MongoDB 3.6 added support for zlib compression, and enabled network compression by default.

To use it in drivers (such as pymongo), add "compressors=zlib" to your connection string:

mongodb://localhost/?compressors=zlib
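
On the Python side, the same option can be passed through the URI that pymongo receives. The following is only a minimal sketch: the helper names build_compressed_uri and connect are made up for illustration, and it assumes a pymongo version with wire compression support and a server that has the chosen compressor enabled. zlibCompressionLevel is an optional URI option; leave it out to use the default level.

```python
from urllib.parse import urlencode


def build_compressed_uri(host="localhost", port=27017,
                         compressors=("zlib",), zlib_level=None):
    """Build a MongoDB URI that asks the driver to negotiate wire compression.

    Hypothetical helper for illustration; only the URI option names
    (compressors, zlibCompressionLevel) come from the driver.
    """
    options = {"compressors": ",".join(compressors)}
    if zlib_level is not None:
        options["zlibCompressionLevel"] = zlib_level
    # Keep commas unescaped so several compressors stay readable in the URI.
    return f"mongodb://{host}:{port}/?{urlencode(options, safe=',')}"


def connect(uri):
    """Open a client for the given URI (needs pymongo and a reachable server)."""
    # Imported lazily so the URI helper can be used without pymongo installed.
    from pymongo import MongoClient
    return MongoClient(uri)


if __name__ == "__main__":
    print(build_compressed_uri(zlib_level=6))
```

Compression is negotiated during the handshake, so if the server does not support any of the listed compressors the connection should simply fall back to uncompressed traffic.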
bjori
1

If you want a simple solution, you can connect to the remote host over a compressed SSH tunnel. Assuming mongod runs on the default port and 27017 is free on localhost:

ssh -C -L 27017:127.0.0.1:27017 remote_user@mongohost

It seems to work pretty well, at least on dummy data. You can even use paramiko to create the SSH tunnel (see "How to create a ssh tunnel using python and paramiko?") and keep everything wrapped inside Python code.
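
If pulling in paramiko feels like too much, a rough alternative is to start the same ssh command from Python with subprocess. This is a sketch rather than a paramiko example: the helper names tunnel_command and open_tunnel are hypothetical, and it assumes an ssh client on the PATH and key-based login so that no password prompt appears. The -N flag, which is not in the command above, tells ssh not to run a remote command and only forward the port.

```python
import subprocess


def tunnel_command(remote, local_port=27017, remote_port=27017):
    """Return the ssh argument list for a compressed local port forward."""
    return [
        "ssh",
        "-C",  # compress all traffic going through the tunnel
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:127.0.0.1:{remote_port}",
        remote,
    ]


def open_tunnel(remote, **ports):
    """Start ssh in the background and return the process handle."""
    return subprocess.Popen(tunnel_command(remote, **ports))


# Example use (not executed here, it needs a reachable host):
#   tunnel = open_tunnel("remote_user@mongohost")
#   ... point the MongoDB client at localhost:27017 ...
#   tunnel.terminate()
```

The tunnel takes a moment to come up, so the script may need to wait or retry before the first query goes through.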

zero323
  • This seems like it might be a good solution. I know it's counter to ssh's purpose, but is there a way to turn off encryption so as to not slow down the transfer? – Thomas Johnson Nov 07 '13 at 17:23
  • Out of the box, no. With OpenSSH you can recompile it to enable unencrypted connections, but I don't think it will make much of a difference. – zero323 Nov 07 '13 at 17:28