As jogojapan actually gave the answer, I'll make this a community wiki.
IMO, this is an adequate solution:
template < typename Iter8, typename Iter32 >
Iter32 Utf8toUtf32(Iter8 _from, Iter8 _from_end, Iter32 _dest, Iter32 _dest_end);
This is intended to return the value you wanted _dest to change to.
If you really also need to return an int, you could return a pair.
To reflect which iterators are read from and which are written to, you could use a naming scheme for the template parameters, e.g. InputIterator8 and OutputIterator32.
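To make the return-by-value idea concrete, here is a minimal sketch of such a function. The name Utf8toUtf32Sketch and the decoding loop are illustrative assumptions, not code from the question: it needs multi-pass (forward) input iterators, and it does not reject overlong encodings or surrogate code points.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical sketch: decodes UTF-8 from [from, from_end) into [dest, dest_end)
// and returns the positions reached in both ranges. It stops at the first
// malformed or truncated sequence, or when the destination is full.
// Assumption: overlong encodings and surrogates are NOT checked.
template <typename InputIterator8, typename OutputIterator32>
std::pair<InputIterator8, OutputIterator32>
Utf8toUtf32Sketch(InputIterator8 from, InputIterator8 from_end,
                  OutputIterator32 dest, OutputIterator32 dest_end)
{
    while (from != from_end && dest != dest_end) {
        const unsigned char lead = static_cast<unsigned char>(*from);
        std::uint32_t cp = 0;
        int extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else break;  // invalid lead byte

        InputIterator8 next = from;  // look ahead without committing
        ++next;
        bool ok = true;
        for (int i = 0; i < extra; ++i, ++next) {
            if (next == from_end) { ok = false; break; }  // truncated input
            const unsigned char c = static_cast<unsigned char>(*next);
            if ((c & 0xC0) != 0x80) { ok = false; break; }  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) break;

        *dest = cp;   // commit one code point
        ++dest;
        from = next;  // commit the consumed bytes
    }
    return std::make_pair(from, dest);
}
```

The caller passes plain arrays or pointers by value and reads the new positions from the returned pair, so nothing has to be bound to a non-const reference.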
To give an analogy from a function of the Standard Library:
std::vector<int> v = {1,2,3,4};
for(auto i = v.begin(); i != v.end();)
{
    if(*i == 2)
    {
        i = v.erase(i); // iterator invalidated and new "next" iterator returned
    }
    else
    {
        ++i; // nothing erased: advance manually, otherwise the loop never ends
    }
}
If you want your function a) to accept arrays and b) to be similar to Standard Library functions, I don't see any other way but to return the "changed" iterators. The only Library function I know that actually changes the iterator passed is std::advance.
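For comparison, a short sketch of the two styles side by side: std::advance modifies its argument in place, while std::next leaves the argument alone and returns the new position. The helper names are hypothetical, and both assume the vector has at least three elements.

```cpp
#include <iterator>
#include <vector>

// Hypothetical helper: std::advance takes the iterator by reference
// and moves it in place.
inline int third_via_advance(const std::vector<int>& v)
{
    auto it = v.begin();
    std::advance(it, 2);  // `it` itself now points at the third element
    return *it;
}

// Hypothetical helper: std::next takes the iterator by value and
// returns the advanced copy; `it` still points at the first element.
inline int third_via_next(const std::vector<int>& v)
{
    auto it = v.begin();
    auto moved = std::next(it, 2);
    return (it == v.begin()) ? *moved : -1;  // -1 would mean `it` was changed
}
```

The second style is the one the suggested Utf8toUtf32 signature follows.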
Example:
template < typename Iter8, typename Iter32 >
std::tuple<int, Iter8, Iter32> Utf8toUtf32(Iter8 _from, Iter8 _from_end,
Iter32 _dest, Iter32 _dest_end);
char utf8String [] = "...some utf8 string ...";
wchar_t wideString [ 100 ];
char* pUtf8Res = nullptr;
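wchar_t* pWideRes = nullptr;
int res = 0;
// Call the function declared above, passing the arrays declared above.
std::tie(res, pUtf8Res, pWideRes) = Utf8toUtf32( std::begin(utf8String), std::end(utf8String),
                                                 std::begin(wideString), std::end(wideString) );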
(Edit by jogojapan)
If you must keep passing the iterators as references because you want to update the text position they point at, neither of the problems described in the question can be solved directly.
Problem 1: Passing wideString, which is a local array, to a function means its type decays to a wchar_t* rvalue, and that cannot be bound to a non-const wchar_t*& reference. In other words, you cannot have a function modify the address of a local array. Casting it to a pointer does not change that fact, and the compiler is wrong when it accepts that solution.
Problem 2: Similarly, passing the address of nCodepoint by reference is impossible, as that address cannot be changed. The only solution is to store the address in a separate pointer first, and then pass that:
unsigned long *pCodepoint = &nCodepoint;
Utf8toUtf32(pIter,pIter+5,pCodepoint,pCodepoint+1);
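The binding rule behind both problems can be shown in isolation. This is a minimal sketch with a hypothetical stand-in function, write_one, that takes its destination by non-const reference the way the question's function does; it is not the real converter.

```cpp
// Hypothetical stand-in for the converter: writes one value through `dest`
// and advances it, so the caller can see how far the output got.
inline void write_one(unsigned long*& dest, unsigned long* dest_end,
                      unsigned long value)
{
    if (dest != dest_end) {
        *dest = value;
        ++dest;
    }
}

inline bool pointer_workaround_demo()
{
    unsigned long nCodepoint = 0;

    // Does not compile: &nCodepoint is an rvalue and cannot bind to
    // a non-const lvalue reference (unsigned long*&).
    // write_one(&nCodepoint, &nCodepoint + 1, 65UL);

    // Storing the address in a named pointer gives the function an lvalue
    // that it is allowed to modify.
    unsigned long* pCodepoint = &nCodepoint;
    write_one(pCodepoint, &nCodepoint + 1, 65UL);

    return nCodepoint == 65UL && pCodepoint == &nCodepoint + 1;
}
```

After the call, pCodepoint has moved one past nCodepoint, which is exactly the progress information the by-reference design is meant to report.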
(Another edit by jogojapan)
If you want to pass by reference, but you want to make the function flexible enough to accept non-reference parameters as well, you can actually provide overloaded definitions of the template:
/* Using C++11 code for convenience. Rewriting in C++03 is easy. */
#include <type_traits>
template <typename T>
using noref = typename std::remove_reference<T>::type;
template <typename Iter8, typename Iter32>
int Utf8toUtf32 (Iter8 &from, const Iter8 from_end, Iter32 &dest, const Iter32 dest_end)
{
return 0;
}
template <typename Iter8, typename Iter32>
int Utf8toUtf32 (Iter8 &from, const Iter8 from_end, noref<Iter32> dest, const Iter32 dest_end)
{
noref<Iter32> p_dest = dest;
return Utf8toUtf32(from,from_end,p_dest,dest_end);
}
template <typename Iter8, typename Iter32>
int Utf8toUtf32 (noref<Iter8> from, const Iter8 from_end, Iter32 &dest, const Iter32 dest_end)
{
noref<Iter8> p_from = from;
return Utf8toUtf32(p_from,from_end,dest,dest_end);
}
template <typename Iter8, typename Iter32>
int Utf8toUtf32 (noref<Iter8> from, const Iter8 from_end, noref<Iter32> dest, const Iter32 dest_end)
{
noref<Iter8> p_from = from;
noref<Iter32> p_dest = dest;
return Utf8toUtf32(p_from,from_end,p_dest,dest_end);
}
You can then call this with all kinds of combinations of lvalues and rvalues:
int main()
{
char input[] = "hello";
const char *p_input = input;
unsigned long dest;
unsigned long *p_dest = &dest;
std::string input_str("hello");
Utf8toUtf32(input,input+5,&dest,&dest+1);
Utf8toUtf32(p_input,p_input+5,&dest,&dest+1);
Utf8toUtf32(input,input+5,p_dest,p_dest+1);
Utf8toUtf32(p_input,p_input+5,p_dest,p_dest+1);
Utf8toUtf32(begin(input_str),end(input_str),p_dest,p_dest+1);
Utf8toUtf32(begin(input_str),end(input_str),&dest,&dest+1);
return 0;
}
But be warned: when passing an rvalue (such as an array or an expression like &local_var), the call will work and there will be no undefined behaviour, but the address of the local variable or array will of course still not change. So in this situation the caller cannot find out how many characters the function was able to process.
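The caveat can be seen in a reduced form. The sketch below uses a hypothetical function, consume, and swaps in a plain rvalue-reference overload instead of the noref templates above, purely to keep the example short. It also returns the number of characters processed, which is one way (an assumption, not part of the original design) to keep that information available to callers who pass an rvalue.

```cpp
// Hypothetical stand-in: "processes" at most 3 characters, advances `from`,
// and returns how many characters were consumed.
inline int consume(const char*& from, const char* from_end)
{
    int n = 0;
    while (from != from_end && n < 3) {
        ++from;
        ++n;
    }
    return n;
}

// Overload chosen for rvalues (e.g. a decayed array). It advances only a
// local copy, so the caller's array obviously stays where it is.
inline int consume(const char*&& from, const char* from_end)
{
    const char* p = from;         // named copy: an lvalue
    return consume(p, from_end);  // dispatches to the reference overload
}
```

With a named pointer the caller sees the pointer move; with an array argument only the return value tells the caller how far processing got.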