I have a Python script that queries a remote MongoDB database. The query results are fairly large (~650MB), but the documents that Mongo sends are extremely compressible because they contain lots of plain text. Is there some way that I can insert a compression "proxy" of some sort between my Python script and the Mongo server, either by modifying the code or by using some sort of network utility?
- Difficult to say without seeing code: how are you requesting the data from Mongo? – Simeon Visser Nov 07 '13 at 16:18
- I'm using PyMongo to do the queries. – Thomas Johnson Nov 07 '13 at 17:24
2 Answers
MongoDB 3.4 added support for network compression using Snappy. It's not enabled by default, though (use networkMessageCompressors=snappy on the server to enable it).
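For example, on a 3.4 server one way to enable it is to pass the option at startup (deployments that use a config file would set the equivalent net.compression.compressors setting instead):
mongod --networkMessageCompressors snappy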
MongoDB 3.6 added support for zlib compression, and enabled network compression by default.
To use it from a driver such as PyMongo, add compressors=zlib to your connection string:
mongodb://localhost/?compressors=zlib
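A minimal PyMongo sketch, assuming a MongoDB 3.6+ server and a PyMongo version with zlib support; the host, database, and collection names are placeholders:

from pymongo import MongoClient

# Request zlib network compression via the connection string
# (requires MongoDB 3.6+ on the server and zlib support in the driver).
client = MongoClient("mongodb://localhost/?compressors=zlib")

db = client["mydatabase"]              # placeholder database name
for doc in db["mycollection"].find():  # placeholder collection name
    pass  # documents now travel over a zlib-compressed wire protocol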

– bjori
If you want a simple solution, you can connect to the remote host over a compressed SSH tunnel. Assuming mongod runs on the default port on the remote machine and port 27017 is free on localhost:
ssh -C -L 27017:127.0.0.1:27017 remote_user@mongohost
It seems to work pretty well, at least on dummy data. You can even use paramiko to create the SSH tunnel (see How to create a ssh tunnel using python and paramiko?) and keep everything wrapped inside Python code, as in the sketch below.
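A minimal sketch of that approach using the third-party sshtunnel package (a paramiko wrapper). The host, user, key path, database, and collection names are placeholders, and the compression=True argument is my reading of sshtunnel's API; if your version doesn't accept it, the plain ssh -C command above works just as well:

import os.path
from sshtunnel import SSHTunnelForwarder
from pymongo import MongoClient

with SSHTunnelForwarder(
    ("mongohost", 22),                             # remote SSH server (placeholder)
    ssh_username="remote_user",
    ssh_pkey=os.path.expanduser("~/.ssh/id_rsa"),  # or ssh_password=...
    remote_bind_address=("127.0.0.1", 27017),      # mongod as seen from the remote host
    local_bind_address=("127.0.0.1", 27017),       # local end of the tunnel
    compression=True,                              # assumed flag: compress the SSH transport
):
    client = MongoClient("mongodb://127.0.0.1:27017/")
    for doc in client["mydatabase"]["mycollection"].find():
        pass  # documents arrive through the compressed tunnel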
- This seems like it might be a good solution. I know it's counter to SSH's purpose, but is there a way to turn off encryption so as not to slow down the transfer? – Thomas Johnson Nov 07 '13 at 17:23
- Out of the box, no. In the case of OpenSSH you can recompile it to enable unencrypted connections, but I think it won't make much of a difference. – zero323 Nov 07 '13 at 17:28