Until you've seen the profiler output, you can't be sure where
the slowdown is, but there are a number of points which seem
likely to me to cause one. The two most important are:
- Your function creates two new strings at each call. That can
  be very expensive.
- You use the two-operand form of std::tolower; this function
  must extract the ctype facet each time it is called (and you
  construct a new temporary instance of the locale each time you
  invoke lowercomp).
My own preference is to use a functional object for the
comparison. With some compilers, it's faster, but in this case,
it's also a lot cleaner:
#include <algorithm>
#include <locale>
#include <string>

class CaseInsensitiveCompare
{
    std::locale myLocale;   // To ensure lifetime of the facet.
    std::ctype<char> const& myCType;
public:
    CaseInsensitiveCompare( std::locale const& locale = std::locale( "" ) )
        : myLocale( locale )
        , myCType( std::use_facet<std::ctype<char>>( myLocale ) )
    {
    }
    // Compares two items by title.
    bool operator()( AbstractItem const* lhs, AbstractItem const* rhs ) const
    {
        return (*this)( lhs->title(), rhs->title() );
    }
    // Compares two strings, using the char overload below on each pair.
    bool operator()( std::string const& lhs, std::string const& rhs ) const
    {
        return std::lexicographical_compare(
            lhs.begin(), lhs.end(),
            rhs.begin(), rhs.end(),
            *this );
    }
    // Compares two characters, ignoring case; the facet is fetched only once.
    bool operator()( char lhs, char rhs ) const
    {
        return myCType.tolower( lhs ) < myCType.tolower( rhs );
    }
};
Beyond this, there are several other points which might improve
performance:
- If you're sure of the lifetime of the locale you're using
  (and you usually can be), drop the myLocale member in the
  class; copying the locale will be the most expensive part of
  copying instances of this class (and the call to
  lexicographical_compare will copy it at least once).
- If you don't need the localization features, consider using
  the tolower function in <cctype>, rather than the one in
  <locale>. This will avoid the need for any data members at all
  in the comparison.
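A minimal sketch of what the <cctype> variant might look like, for
strings only. The name SimpleCaseInsensitiveCompare is made up for
illustration, and the AbstractItem overload is left out because that
class isn't shown here. Note the conversion to unsigned char:
passing a negative char value to std::tolower is undefined behavior.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical name: a stateless variant of the comparator above.
// It uses the <cctype> tolower, so it follows the global C locale
// and needs no data members.
struct SimpleCaseInsensitiveCompare
{
    bool operator()( std::string const& lhs, std::string const& rhs ) const
    {
        return std::lexicographical_compare(
            lhs.begin(), lhs.end(),
            rhs.begin(), rhs.end(),
            *this );
    }
    bool operator()( char lhs, char rhs ) const
    {
        // Convert to unsigned char first; a negative argument
        // (other than EOF) is undefined behavior for std::tolower.
        return std::tolower( static_cast<unsigned char>( lhs ) )
             < std::tolower( static_cast<unsigned char>( rhs ) );
    }
};
```

Since the object is empty, copying it (as lexicographical_compare
and std::sort will do) costs essentially nothing.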
Finally, although I'm not sure it's worth it for something as
small as 10K items, you might consider making vectors with the
canonical forms of the strings (already lower cased, etc.),
sorting those using just < on the strings, and then reordering
the original vectors according to that.
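One way the canonical-form idea could be sketched, assuming the
titles are available as a vector of strings. The helper names
(canonicalForm, sortedByCanonicalForm) are invented here, and sorting
a vector of indices is just one possible way to do the reordering.
Each string is lower-cased exactly once, instead of once per
comparison.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: lower-cased copy of a string.
std::string canonicalForm( std::string const& s )
{
    std::string result( s );
    std::transform( result.begin(), result.end(), result.begin(),
        []( unsigned char c ) { return static_cast<char>( std::tolower( c ) ); } );
    return result;
}

// Hypothetical helper: returns the titles ordered by canonical form.
std::vector<std::string> sortedByCanonicalForm(
    std::vector<std::string> const& titles )
{
    // Build the canonical keys once, up front.
    std::vector<std::string> keys;
    keys.reserve( titles.size() );
    for ( std::string const& t : titles ) {
        keys.push_back( canonicalForm( t ) );
    }

    // Sort indices, comparing the keys with plain <.
    std::vector<std::size_t> order( titles.size() );
    std::iota( order.begin(), order.end(), std::size_t( 0 ) );
    std::stable_sort( order.begin(), order.end(),
        [&keys]( std::size_t a, std::size_t b ) { return keys[a] < keys[b]; } );

    // Reorder the originals according to the sorted indices.
    std::vector<std::string> result;
    result.reserve( titles.size() );
    for ( std::size_t i : order ) {
        result.push_back( titles[i] );
    }
    return result;
}
```

The same index vector could be used to reorder a vector of
AbstractItem pointers instead of copying the strings. Whether the
extra memory pays off is something only the profiler can tell you.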
Also, I'm very suspicious of the `new
boost::timer::auto_cpu_timer`. Do you really need dynamic
allocation here? Offhand, I suspect a local variable would be
more appropriate.