I am an experienced network programmer facing a situation where I need some advice.

I need to distribute some data across several outgoing interfaces (via separate TCP socket connections, one per interface). The important part is that I should be able to send most of the data on the interface with the better bandwidth, i.e. the one that can send faster.

My idea was to use the select API (available on both Unix and Windows) for this purpose. I have used select, poll and even epoll in the past, but always for READING from multiple sockets whenever data is available.

Here I intend to write successive packets to the interfaces in sequence, then monitor each socket in select's write descriptor set. Whichever becomes available first (meaning it was able to send its packet first), I would keep sending more packets via that descriptor.

Will I be able to achieve my intention here? That is, if I have an interface with a 10 Mbps link and another with 1 Mbps, I hope to get most of the packets out via the faster interface.

Update 1: I was wondering how select behaves in this case. When you call select on read descriptors, the ones on which data is available are returned. In my scenario, however, we write to the descriptors and wait for select to return the one that finished writing first. Does select return only once the entire packet has been written? Say I try to write 1200 bytes in one go: will select return only when all 1200 bytes have been written or there is a permanent error? I am not sure how select would behave and failed to find any documentation describing that.

fkl
  • Do you have **one** data source which shall be sent via **all** the different connections, or do you have **multiple** sources, one per connection for example? – alk Jun 13 '13 at 09:10
  • A single data source, i.e. a file which I am required to divide into fixed-size chunks and send across all interfaces, one chunk at a time. – fkl Jun 13 '13 at 09:27

2 Answers

I'd adapt the producer/consumer pattern. In this case one producer and several consumers.

Let the main thread handle your source (be the producer) and spawn off one thread for each connection (being the consumers).

The threads run in parallel; each one pulls a chunk from the source and sends it over its connection, one chunk at a time.

The thread holding the fastest connection is expected to send the most chunks in this setup.
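A minimal sketch of that setup, assuming POSIX threads. All names here (`chunk_source`, `worker`, `run_workers`) are made up for illustration, the producer is reduced to a mutex-protected chunk counter, and the blocking send is replaced by a sleep so the sketch runs without real connections:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

#define MAX_WORKERS 16

/* Hypothetical shared source: reduced to a counter of chunks handed out. */
struct chunk_source {
    pthread_mutex_t lock;
    size_t next_chunk;    /* index of the next chunk to hand out */
    size_t total_chunks;  /* number of chunks in the source */
};

/* Hypothetical per-connection worker state. */
struct worker {
    struct chunk_source *src;
    unsigned delay_us;    /* stand-in for link speed: time to "send" a chunk */
    size_t sent;          /* chunks sent by this worker */
};

/* Hand out the next chunk index; returns 0 once the source is exhausted. */
static int take_chunk(struct chunk_source *src, size_t *index)
{
    int ok = 0;
    pthread_mutex_lock(&src->lock);
    if (src->next_chunk < src->total_chunks) {
        *index = src->next_chunk++;
        ok = 1;
    }
    pthread_mutex_unlock(&src->lock);
    return ok;
}

/* Assumption: a real worker would do a blocking send() of the chunk here. */
static void pretend_send(unsigned delay_us)
{
    struct timespec ts = { delay_us / 1000000u,
                           (long)(delay_us % 1000000u) * 1000L };
    nanosleep(&ts, NULL);
}

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    size_t index;
    while (take_chunk(w->src, &index)) {
        pretend_send(w->delay_us);
        w->sent++;
    }
    return NULL;
}

/* Start one thread per worker and wait for all of them; 0 on success. */
static int run_workers(struct worker *workers, size_t n, size_t total_chunks)
{
    pthread_t tids[MAX_WORKERS];
    struct chunk_source src;
    size_t started = 0;
    int rc = 0;

    assert(n > 0 && n <= MAX_WORKERS);
    if (pthread_mutex_init(&src.lock, NULL) != 0)
        return -1;
    src.next_chunk = 0;
    src.total_chunks = total_chunks;

    for (; started < n; started++) {
        workers[started].src = &src;
        workers[started].sent = 0;
        if (pthread_create(&tids[started], NULL, worker_main,
                           &workers[started]) != 0) {
            rc = -1;
            break;
        }
    }
    for (size_t i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&src.lock);
    return rc;
}
```

Because each thread only asks for a new chunk after its previous send has returned, the thread on the faster link comes back more often and ends up with the larger share, with no explicit bandwidth measurement.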

alk
  • Thanks @alk, I agree; this was a possible solution in my mind as well, but I wanted to refrain from implementing several threads unless I am certain that is the way to go. Can you think of a benefit this approach has that the select approach would lack? I have added an update to the question above about something I wasn't sure of regarding select. Would appreciate your comments – fkl Jun 13 '13 at 11:51

Using poll/epoll/select for writing is rather tricky. The reason is that sockets are mostly ready for writing unless their socket send buffer is full. So, polling for 'writable' is apt to just spin without ever waiting.
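To see this for yourself, here is a small sketch (the helper name `writable_now` is made up) that asks select() with a zero timeout whether a descriptor is writable. On a descriptor whose buffer still has room it reports "writable" straight away, which only tells you the kernel can accept more bytes, not that anything written earlier has left the machine:

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if fd is writable right now, 0 if not, -1 on error.
 * Hypothetical helper for illustration only. */
static int writable_now(int fd)
{
    fd_set wfds;
    struct timeval zero = { 0, 0 };  /* zero timeout: poll, do not block */
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);

    rc = select(fd + 1, NULL, &wfds, NULL, &zero);
    if (rc < 0)
        return -1;
    return FD_ISSET(fd, &wfds) ? 1 : 0;
}
```

Calling this in a loop on an idle connection would return 1 every time, which is the busy spin described above.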

You need to proceed as follows:

  1. When you have something to write to a socket (which must be in non-blocking mode), write it, in a loop that terminates when all the data has been written or write() returns -1 with errno == EAGAIN/EWOULDBLOCK.

  2. At that point you have a full socket send buffer. So, you need to register this socket with the selector/poll/epoll for writability.

  3. When you have nothing else to do, select/poll/epoll and repeat the writes that caused the associated sockets to be polled for writability.

  4. Do those writes the same way as at (1) but this time, if the write completes, deregister the socket for writability.

In other words, you must only select/poll for writability if you already know the socket's send buffer is full, and you must stop doing so as soon as you know it isn't.
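The four steps above could be sketched roughly as follows. This is only an illustration under some assumptions: `struct conn`, `flush_conn` and `wait_and_retry` are made-up names, select() is used as the poller, and the descriptors are assumed to be small enough for an fd_set:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-connection state. */
struct conn {
    int fd;           /* descriptor in non-blocking mode */
    const char *buf;  /* data not yet accepted by the kernel */
    size_t len;       /* bytes remaining in buf */
    int want_write;   /* 1 while the send buffer is known to be full */
};

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Steps 1 and 4: write until everything is accepted or the buffer is full.
 * Returns 0 on success (possibly with data still pending), -1 on error. */
static int flush_conn(struct conn *c)
{
    while (c->len > 0) {
        ssize_t n = write(c->fd, c->buf, c->len);
        if (n > 0) {               /* partial writes are normal */
            c->buf += n;
            c->len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            c->want_write = 1;     /* step 2: register for writability */
            return 0;
        }
        return -1;                 /* any other failure */
    }
    c->want_write = 0;             /* step 4: deregister when done */
    return 0;
}

/* Step 3: block in select() only on connections known to be full, then
 * retry their writes. Returns the number of connections retried, or -1. */
static int wait_and_retry(struct conn *conns, size_t n)
{
    fd_set wfds;
    int maxfd = -1, retried = 0;

    FD_ZERO(&wfds);
    for (size_t i = 0; i < n; i++) {
        if (!conns[i].want_write)
            continue;
        assert(conns[i].fd >= 0 && conns[i].fd < FD_SETSIZE);
        FD_SET(conns[i].fd, &wfds);
        if (conns[i].fd > maxfd)
            maxfd = conns[i].fd;
    }
    if (maxfd < 0)
        return 0;                  /* nothing is blocked, nothing to wait for */

    if (select(maxfd + 1, NULL, &wfds, NULL, NULL) < 0)
        return errno == EINTR ? 0 : -1;

    for (size_t i = 0; i < n; i++) {
        if (conns[i].want_write && FD_ISSET(conns[i].fd, &wfds)) {
            if (flush_conn(&conns[i]) < 0)
                return -1;
            retried++;
        }
    }
    return retried;
}
```

For the multi-interface case in the question, a connection whose `want_write` flag is clear is the one to hand the next chunk to. The faster link drains its send buffer sooner, so it comes back from `wait_and_retry` more often.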

How you fit all this into your application is another question.

user207421