
So I'm setting up a single server which will connect to and receive data from potentially hundreds of slaves. Currently I see no reason for the server to send data back to the slaves, other than a simple config file. However, the clients/slaves will be sending across a zip/tar file containing a substantial number of small images (4K+).

My question is: what would be the best way to do this? Given that the clients will be OS X, iOS and Windows systems, connecting to a single Ubuntu endpoint, I was thinking about using TCP for basic communication and commands, and then using that to trigger a file transfer using something like UFTP or UDT to batch-transfer the files.
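
For illustration, here is a minimal sketch of what that control channel could look like on the Python/server side. The command names, port numbers and replies are invented for the example, not part of any existing protocol:

```python
import socketserver

class ControlHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Read one newline-terminated command from the client.
        command = self.rfile.readline().strip().decode("utf-8")
        if command == "GET_CONFIG":
            # Send back the config file contents (placeholder).
            self.wfile.write(b"...config goes here...\n")
        elif command == "UPLOAD_READY":
            # Tell the client where to send the bulk data
            # (UFTP/UDT/FTP endpoint -- whatever is chosen).
            self.wfile.write(b"UPLOAD_TO 192.0.2.10:9000\n")
        else:
            self.wfile.write(b"UNKNOWN_COMMAND\n")

if __name__ == "__main__":
    # Listen for control connections from the clients.
    with socketserver.ThreadingTCPServer(("0.0.0.0", 5000), ControlHandler) as server:
        server.serve_forever()
```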

Bear in mind that whatever system is put in place needs to support C#/.NET for the Windows development, and Python for the server. After doing a bit of research I have found that UDT, whilst written in C++, has a pretty well done C# wrapper and a (somewhat) primitive Python wrapper, so at the moment I'm thinking of going with something like UFTP.

So what do you guys reckon?

  • Why do you think a communication protocol would be the bottleneck? hdd is very slow compared to a LAN. Your bottleneck might be the speed of local disks on the clients. Or the bottleneck might be the total network bandwidth on the server (btw, avoid physical singletons -- your hundreds of clients should be able to continue working even if the server dies). You could start with sftp/scp and see where it goes. – jfs Sep 01 '15 at 13:25
  • I wanted a more native solution, in the sense that I don't want to have to package WinSCP with the Windows software. And I'm assuming that the network would be the bottleneck, in the sense that we are sending thousands of tiny files across it; as far as testing is concerned, I haven't had problems with HDD writes, even on my crappy little Ubuntu test box ;) – amartin94 Sep 01 '15 at 14:08
  • you could [use `paramiko` to scp in Python](http://stackoverflow.com/q/250283/4279). You could implement the equivalent of `tar -c dir/ | gzip | gpg -c | ssh user@remote 'dd of=dir.tar.gz.gpg'` in Python (you should probably skip the `gzip` step for image files -- test it). There could be something similar in C#. Make the simplest thing that works and measure its time performance, then eliminate bottlenecks until your performance goal is reached (some steps might require a complete redesign, but you shouldn't optimize prematurely). – jfs Sep 01 '15 at 14:32
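
To make the suggestion in that last comment concrete, here is a rough sketch of a paramiko-based upload in Python; the host name, credentials and paths are placeholders, and the gzip/gpg steps from the pipeline are omitted:

```python
import tarfile
import paramiko

def upload_images(local_dir, host, user, password):
    # Bundle the image directory into a single tar archive; gzip is skipped
    # because the images are assumed to be compressed already.
    archive = "images.tar"
    with tarfile.open(archive, "w") as tar:
        tar.add(local_dir, arcname="images")

    # Push the archive to the server over SFTP.
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=user, password=password)
    sftp = client.open_sftp()
    sftp.put(archive, "/srv/uploads/images.tar")
    sftp.close()
    client.close()

upload_images("frames/", "server.example.com", "uploader", "secret")
```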

1 Answer


Why don't you just use ready-made FTP?

  • Virtually any OS has an FTP client on board, and any programming language you choose should be able to call the ftp binary.

  • Installing an FTP server on the Linux machine should be trivial as well. If you need full programmatic control over the server, check out pyftpdlib, packaged for Debian (and thus Ubuntu) as python-pyftpdlib (see the sketch below).
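
A minimal pyftpdlib server, loosely following the library's quick-start example; the user name, password and upload directory are placeholders:

```python
from pyftpdlib.authorizers import DummyAuthorizer
from pyftpdlib.handlers import FTPHandler
from pyftpdlib.servers import FTPServer

# One upload account; "elradfmw" grants list/download/upload/delete/mkdir permissions.
authorizer = DummyAuthorizer()
authorizer.add_user("uploader", "secret", "/srv/uploads", perm="elradfmw")

handler = FTPHandler
handler.authorizer = authorizer

# Listen on all interfaces, on a non-privileged port.
server = FTPServer(("0.0.0.0", 2121), handler)
server.serve_forever()
```

Clients could then push their archives with any stock FTP client or library (e.g. `FtpWebRequest` on the .NET side).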

  • From what I've heard, FTP isn't as fast as, say, a UDP-based file transfer system, and also the two examples I gave in the OP have encryption baked in. – amartin94 Sep 01 '15 at 07:59
  • In your question you didn't mention that encryption is a concern for you, so why is it now? – umläute Sep 01 '15 at 18:16
  • Regarding speed: if the number of transmissions is low compared to the size of the data, then the only chance for one protocol to be faster than another is by using compression; but assuming that you have already compressed (zipped, gzipped) your data, the built-in compression of a transport will not buy you much (it will even be less efficient, as it tries to re-compress in vain). – umläute Sep 01 '15 at 18:17