asio::async_write performance limit

Question

i'm developing a server using asio to dispatch data over tcp. I have an implementation (you can see the send at the bottom of the question) very similar to the one exposed in this answer.

I added some measuremet and observed that the boost::asio::async_send method takes sometimes a long time. I was expecting (with no clue about that, just expectation) it should be a lot better, cause 100 microseconds for an asynchronous function seems a lot anyway from my point of view.

You can see the code and some measurment examples below, i'm doing something wrong or were just my expectations to be wrong?

Compiled with msvc visual studio 2019 16.2.5


std::vector<char> m_active_send_buffer_index[2];

    void do_send()
    {// Check if there is an async send in progress
        // An empty active buffer denotes no outstanding send
        if (m_stopped)
            return;

        if (m_send_buffers[m_active_send_buffer_index].size() == 0)
        {
            m_active_send_buffer_index ^= 1;//switch buffer

            auto self(this->shared_from_this());
            m_buffer_seq.push_back(boost::asio::buffer(m_send_buffers[m_active_send_buffer_index]));


            auto start = std::chrono::high_resolution_clock::now().time_since_epoch().count();
            boost::asio::async_write(m_socket, 
                    m_buffer_seq,
                    [self](const boost::system::error_code& error_code, size_t bytes_transferred)
                    {
                        std::lock_guard<std::mutex> lock(self->m_send_mutex);           
                        self->m_send_buffers[self->m_active_send_buffer_index].clear();
                        self->m_buffer_seq.clear();

                        if (!error_code)
                        {// no error occurred
                        // Check if there is more to send that has been queued up on the inactive buffer,
                        // while we were sending the active buffer

                            if (self->m_send_buffers[self->m_active_send_buffer_index ^ 1].size() > 0)
                            {
                                self->do_send();
                            }
                        }
                    }
                );
            auto stop = std::chrono::high_resolution_clock::now().time_since_epoch().count();
            static long count = 0;
            static long long time = 0;
            static long long max = 0;
            static long long min = 0xFFFFFFFF;
            static std::mutex mtx;
            mtx.lock();
            count++;
            long long t = stop - start;
            time += t;
            max = max < t ? t : max;
            min = min > t ? t : min;
            if (count == 10000)
            {
                std::cout << "last 10000 => avg: " << (time / 10000) << " - max:" << max << " - min:" << min << std::endl;
                count = 0; 
                time = 0;
                max = 0;
                min = 0xFFFFFFFF;
            }
            mtx.unlock();
        }
    }

Here some measurments example:

on my desktop machine (core i5 7400 - Windows10)

last 10000 => avg: 116434 - max:808700 - min:19200
last 10000 => avg: 120910 - max:1495700 - min:13100

on production server (xeon E5-2687W v4 - WindowsServer 2016)

last 10000 => avg: 35742 - max:198656 - min:5462
last 10000 => avg: 34844 - max:178517 - min:7509

And use `lock_guard`. In this case, looks like `std::atomic_int` or so would be enough and the mutex just has a ton of unnecessary overhead — sehe, Sep 23 '19 at 11:56
@sehe i put the `mutex` aroud the statistics collection just for convenience, so it locks even the `std::cout`. It does not affect the relevant part, anyway. @AlanBirtles i know about `strands`, do you think, besides simplification, they could help so much with performance of this specific case? — Federico, Sep 23 '19 at 12:27
No, it would make your handler faster but starting the async method would still take the same time. I assume you're using an optimised build? — Alan Birtles, Sep 23 '19 at 16:23
How do you know that you measure microseconds ? According to doc "Class std::chrono::high_resolution_clock represents the clock with the smallest tick period provided by the implementation." so you need to get the period member to get "period a std::ratio type representing the tick period of the clock, in seconds" And I do not see where you access that period ... — Jean Davy, Sep 23 '19 at 21:05

asio::async_write performance limit

0 Answers0