I would like to create a custom lower / upper case function for wstrings.

Therefore I am using a map of integers.

Currently I am automatically creating a function from these maps:

(...)
else if (iCharCode==65)
{
    iRet=97;
}
else if (iCharCode==66)
{
    iRet=98;
}
else if (iCharCode==67)
{
    iRet=99;
}
else if (iCharCode==68)
{
    iRet=100;
}
else if (iCharCode==69)
{
    iRet=101;
}
else if (iCharCode==70)
{
    iRet=102;
}
else if (iCharCode==42818)
{
    iRet=42819;
}
(...)

However, the function is going to be pretty large if I turn my map into an if-statement chain like this.

I would therefore like to use a real map instead, but I don't want to load it at runtime. I would prefer having a static map, but I am not sure how to do that.

Can somebody share his thoughts?

tmighty
  • That was just a coincidence. Most of the times it is not +32. – tmighty Nov 19 '13 at 17:50
  • You may want to take a look at this question: http://stackoverflow.com/questions/11491/string-to-lower-upper-in-c – Dweeberly Nov 19 '13 at 17:53
  • @Dweeberly Not working for non-ASCII chars, therefore I am doing my own function. – tmighty Nov 19 '13 at 17:55
  • @Jimmy Can you delete your comment? You got so many upvotes that it is misleading. – tmighty Nov 19 '13 at 17:56
  • A `std::map` would use far too much memory. One approach is to create an array for the easy values, for example, an array of 256 `int` case-mapped values for the first 256 values, and a function for the rest. In general, a function can be much more compact than the code here; for example, `if(65 <= iCharCode && iCharCode < 71) iRet = iCharCode + 32; else ...`. – Pete Becker Nov 19 '13 at 18:05
  • OK, I'll delete it (not often I have to delete something because it gets so many upvotes ;-)). Have you considered using a Unicode library? http://userguide.icu-project.org/transforms/casemappings – Jimmy Nov 19 '13 at 18:40
  • The link I posted contained info on Unicode conversions. Are you trying to do codepage conversions? Transcoding is hard and it's generally safer to find a well-tested library. – Dweeberly Nov 19 '13 at 19:01

2 Answers

Maybe the following code can help:

#include<iostream>
#include<map>

std::map<int, int> code_map = {
  {65, 97},
  {66, 98},
  {67, 99},
  {68, 100},
  {69, 101},
  {70, 102},  
};


int main() {
  for(const auto & pair : code_map) {
    std::cout<<pair.first<<" maps to "<<pair.second<<std::endl;
  }
  return 0;
}

Compiling with g++ example.cpp -std=c++11 -Wall -Wextra (OS X 10.7.4 GCC 4.8.1) yields:

$ ./a.out 
65 maps to 97
66 maps to 98
67 maps to 99
68 maps to 100
69 maps to 101
70 maps to 102
Escualo
  • But I think the map is created anew each time I initiate code_map, right? – tmighty Nov 19 '13 at 17:58
  • `code_map` is only constructed once -- when the program is started. – Sam Cristall Nov 19 '13 at 18:11
  • I am getting the error "Initialization using {...} is not valid for type map". – tmighty Nov 19 '13 at 18:14
  • @tmighty: That style of initialisation is only valid in C++11. If you're stuck in the past, you'll need to write a function to initialise it, or use something like Boost.Assignment. Alternatively, use a sorted array of `pair` rather than a map, with a binary search for lookup. That's perhaps more efficient, and can be initialised statically. – Mike Seymour Nov 19 '13 at 18:17
  • @MikeSeymour I have updated from VS2010 to VS2012 now, but the IDE tells me the same error. – tmighty Nov 19 '13 at 20:16
  • The exact error is: IntelliSense: Initialization with "{...}" invalid for object of type ""std::map, std::allocator>>"" – tmighty Nov 19 '13 at 21:09

You could implement the map as a sorted array of pairs. This can be initialised statically. Look up a value using a binary search, e.g. `std::lower_bound` with a comparator comparing the first element of each pair.
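
A minimal sketch of the sorted-array idea, assuming plain `int` code points and using only the values quoted in the question. The names `CaseEntry`, `kLowerTable`, `entry_less` and `to_lower_code` are made up for illustration; the table is a plain aggregate, so it needs no C++11 initialiser-list support and must simply be kept sorted by `from`.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical names, chosen for this sketch only.
struct CaseEntry {
    int from;  // code point to convert
    int to;    // its lower-case counterpart
};

// Aggregate-initialised table; it MUST stay sorted by `from`
// because the lookup below is a binary search.
static const CaseEntry kLowerTable[] = {
    {65, 97}, {66, 98}, {67, 99}, {68, 100}, {69, 101}, {70, 102},
    {42818, 42819}
};
static const std::size_t kLowerTableSize =
    sizeof(kLowerTable) / sizeof(kLowerTable[0]);

// Comparator for std::lower_bound: compares an entry's key with the
// code point being searched for.
static bool entry_less(const CaseEntry& entry, int code) {
    return entry.from < code;
}

// Returns the mapped code point, or `code` unchanged if it is not in the table.
int to_lower_code(int code) {
    const CaseEntry* end = kLowerTable + kLowerTableSize;
    const CaseEntry* it = std::lower_bound(kLowerTable, end, code, entry_less);
    return (it != end && it->from == code) ? it->to : code;
}
```

With this table, `to_lower_code(65)` gives 97 and an unmapped value such as 71 is returned as is. Each lookup costs O(log n) comparisons and the table lives in read-only static storage.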

As a further optimisation, you might consider mapping ranges rather than individual characters, using an array of `(range_begin, range_end, offset)` triplets. This would reduce the whole ASCII alphabet to a single entry, but might be less effective for other alphabets.
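
The range variant could be sketched as follows, assuming half-open ranges `[begin, end)` that are sorted and do not overlap. `CaseRange`, `kLowerRanges`, `code_before_range` and `to_lower_by_range` are illustrative names, and the table holds only the ASCII letters plus the single non-ASCII pair from the question.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical names, chosen for this sketch only.
struct CaseRange {
    int begin;   // first code point of the range (inclusive)
    int end;     // one past the last code point (exclusive)
    int offset;  // value added to any code point inside the range
};

// Sorted by `begin`, non-overlapping.
static const CaseRange kLowerRanges[] = {
    {65, 91, 32},      // 'A'..'Z' -> 'a'..'z'
    {42818, 42819, 1}  // the single non-ASCII pair from the question
};
static const std::size_t kLowerRangeCount =
    sizeof(kLowerRanges) / sizeof(kLowerRanges[0]);

// Comparator for std::upper_bound: true if `code` lies before the range start.
static bool code_before_range(int code, const CaseRange& range) {
    return code < range.begin;
}

// Returns the mapped code point, or `code` unchanged if no range contains it.
int to_lower_by_range(int code) {
    const CaseRange* first = kLowerRanges;
    const CaseRange* last = kLowerRanges + kLowerRangeCount;
    // Find the first range starting after `code`; the candidate is the one before it.
    const CaseRange* it = std::upper_bound(first, last, code, code_before_range);
    if (it == first) {
        return code;  // `code` is below every range
    }
    --it;
    return (code < it->end) ? code + it->offset : code;
}
```

Here the 26 ASCII letters cost one table entry, while an isolated pair such as 42818 still needs an entry of its own, which is the trade-off mentioned above.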

Mike Seymour