0

Currently I am making a program in which I estimate WiFi device coordinates by using the RSSI. The program contains a bottleneck.

I have tried replacing the string comparison with other functions. That didn

The full function:

std::list<std::list<wSignal*>> SignalGrouper::groupByMac (std::list<wSignal*> signals)
{
    std::list<std::list<wSignal*>> groupedSignals;
    std::list<wSignal*> doneSignals;
    for (std::list<wSignal*>::iterator it1=signals.begin(); it1 != signals.end(); ++it1) //take first signal
    {
        if(DoesSignalExist(doneSignals, *it1) == false) //check if signal is already been grouped
        {
            std::list<wSignal*> group;
            for (std::list<wSignal*>::iterator it2=signals.begin(); it2 != signals.end(); ++it2)
            {
                if(DoesSignalExist(doneSignals, *it2) == false)
                {
                    if(boost::iequals((*it2)->MAC, (*it1)->MAC))
                    {
                        group.push_back(*it2);
                        doneSignals.push_back(*it2);
                    }
                }
            }
            groupedSignals.push_back(group);
        }
    }
    return groupedSignals;
}
Rik Smits
  • 63
  • 9
  • 1
    How did you determine it's the bottleneck? Was is statistical sampling of stack traces? Because if you have billions of string, naturally the code will spend a lot of time in `(*it2)->MAC == (*it1)->MAC`. You can't get more efficient than the dedicated function. – StoryTeller - Unslander Monica Jul 19 '17 at 12:16
  • The `std::string` compare functions first checks the string lengths, they will be equal here because you have 2 MAC addresses. If they are equal every character will be compared until the first difference or `\0` is found. So there is no faster way... As StoryTeller already said: Is that really the bottleneck? – Andre Kampling Jul 19 '17 at 12:19
  • Try changing `signals` to be an `std::map` with the `MAC` as the key. – Richard Critten Jul 19 '17 at 12:20
  • 4
    You pass your list by copy ?! – Jarod42 Jul 19 '17 at 12:20
  • 2
    You can `sort` your signals by MAC to reduce complexity of algorithm from O(N²) to `O(N logN)`. – Jarod42 Jul 19 '17 at 12:22
  • 2
    The linked list is the most overrated data structure. – molbdnilo Jul 19 '17 at 12:22
  • 1
    What if `it1 == it2`? No need to compare that. You are also comparing every element twice. – Some programmer dude Jul 19 '17 at 12:22
  • Also think of storing your `wSignal` by value in a container and not the pointer as it needs one less indirection. – Andre Kampling Jul 19 '17 at 12:23
  • If you are using Linux you can trace the bottleneck with `perf`. Use e.g. `perf top pid`, you will exacly see what function call will consume most cpu cycles: https://gist.github.com/hrwgc/9750190 – Andre Kampling Jul 19 '17 at 12:52
  • 4
    Since MAC address is exactly 6 bytes, you can use `uint64_t` for storing and comparing items. – vasek Jul 19 '17 at 12:54
  • @AndreKampling I did. In the report came on the top a process with overhead of 87,70%. But I can't really see what function it uses. This is the row in perf: 87,70% LocatieApp LocatieApp [.] _ZN5boost9function2IvRKNS_6system10error_codeESt4pairINS_4asio2ip23basic_resolver_iteratorINS7_3udpEEESA_EED1Ev – Rik Smits Jul 19 '17 at 13:25
  • @RikSmits: Compile and link with debug informations: `-g`. But as the other comment already said: pass the list by const reference to the function. – Andre Kampling Jul 19 '17 at 13:27

4 Answers4

1

I am also sceptical whether the string comparison is the real problem. But if you insist on faster way to compare MAC strings, you could try comparing in reverse, since the prefix (OUI) is given to vendor by IEEE and therefore is always the same for the same vendor.

mmatous
  • 482
  • 6
  • 16
1

Has it to be a std::list to be returned? Otherwise you could reduce the iterating steps by using a std::map like this:

std::map<MAC, std::list<wSignal*>> SignalGrouper::groupByMac (std::list<wSignal*> signals)
{
    std::map<MAC, std::list<wSignal*>> groupedSignals;
    for (std::list<wSignal*>::iterator it1 = signals.begin(); it1 != signals.end(); ++it1) //take first signal
    {
        std::map<MAC, std::list<wSignal*>>::iterator it2 = groupedSignals.find((*it1)->MAC);
        if(it2 != groupedSignals.end()) {
            it->second.push_back(*it1);
        } else {
            groupedSignals[(*it1)->MAC] = (*it1);
        }
    }
    return groupedSignals;
}

Not tested, but should work something like that.

cruesli
  • 26
  • 4
0

Try

#include <boost/algorithm/string.hpp>

boost::equals((*it2)->MAC, (*it2)->MAC);

or for case insensitive comparison

boost::iequals((*it2)->MAC, (*it2)->MAC);
cruesli
  • 26
  • 4
  • Improving the string comparison like this didnt work. Any suggestions where else the bottleneck could be? In this function I am trying to sort a std::list to smaller std::list in which the signals all have the same MAC. I want those smaller lists contained in one list: std::list> – Rik Smits Jul 19 '17 at 13:09
0

No, there is no faster way to directly compare two arbitrary strings. The built-in method is the fastest way.

Rather than your current O(n^2) algorithm, you could sort the MAC list first (eg. by putting them into a std::vector then using std::sort) in O(n log n), then just run a single iteration over the sorted vector to aggregate the adjacent equal elements into a list of groups (which is O(n), for an overall complexity of O(n log n)).

With a large number of MACs to group, it's likely that an algorithm complexity change like this will cause a larger performance increase than trying to optimise that single line.

hnefatl
  • 5,860
  • 2
  • 27
  • 49