6

I love using Haskell but am forced to use C++ for school assignments. I am writing my own library for C++ that emulates Haskell's Prelude functions so I can write in a more concise and functional style in C++ if I wish (repo on GitHub).

A problem I have run into is implementing functions like map that operate on lists. In Haskell, String is equivalent to [Char], so you can use strings in functions that take lists. In C++, std::string is not the same thing as std::vector<char>, so I have to write multiple versions of functions to take std::string or std::vector<Type>. This works fine for functions like filter or tail because their input and output are the same type. But with map, I need to be able to transform a vector of ints into chars, or a string into a vector of bools.


When trying to run a simple pig latin converter (pigLatin.cpp on GitHub), the unwords function fails because map isn't working.

examples/pigLatin.cpp:20:29: error: no matching function for call to 'unwords'
  std::string translation = unwords(map(pigLatin, words(input)));
                            ^~~~~~~
examples/../prelude.hpp:591:15: note: candidate function not viable: no known conversion from 'std::string' (aka 'basic_string<char, char_traits<char>, allocator<char> >') to 'const std::vector<std::string>'
      (aka 'const vector<basic_string<char, char_traits<char>, allocator<char> > >') for 1st argument
  std::string unwords(const std::vector<std::string> &xs) {
              ^
1 error generated.

How can I write my map function such that it behaves like the one in Haskell (map on Hackage):

map :: (a -> b) -> [a] -> [b]

I don't know enough about the nuances of C++ templates to figure this one out. This is what I have so far (map from prelude.hpp on GitHub):

// map :: (a -> b) -> [a] -> [b]
template <typename Function, typename Input, typename Output>
std::vector<Output> map(const Function &f, const std::vector<Input> &xs) {
  const int size = xs.size();
  std::vector<Output> temp;
  for (int i = 0; i < size; ++i) {
    temp.push_back(f(xs[i]));
  }
  return temp;
}

// map :: (String -> a) -> String -> [a]
template <typename Function, typename Output>
std::vector<Output> map(const Function &f, const std::string &xs) {
  const int size = xs.size();
  std::vector<Output> temp;
  for (int i = 0; i < size; ++i) {
    temp.push_back(f(xs[i]));
  }
  return temp;
}

// map :: (a -> String) -> [a] -> String
template <typename Function, typename Input>
std::string map(const Function &f, const std::vector<Input> &xs) {
  const int size = xs.size();
  std::string temp;
  for (int i = 0; i < size; ++i) {
    temp += f(xs[i]);
  }
  return temp;
}

// map :: (String -> String) -> String -> String
template <typename Function>
std::string map(const Function &f, const std::string &xs) {
  const int size = xs.size();
  std::string temp;
  for (int i = 0; i < size; ++i) {
    temp += f(xs[i]);
  }
  return temp;
}
evanrelf
  • 303
  • 3
  • 12
  • The first thing I see is that in `template `, the relationship between `Function` and `Input` and `Output` is not specified. How about `std::vector map(Func, const std::vector &xs)`, for a suitable definition of `Func`. – luqui Dec 10 '17 at 20:12
  • @luqui C++ has SFINAE, which somewhat encourages the original style, I think, since it produces a more general template. – chi Dec 10 '17 at 20:17
  • 1
    The issue here seems to be that `map(pigLatin, words(input))` is producing a string instead of a vector -- the wrong overload is being chosen. – chi Dec 10 '17 at 20:20
  • @chi Exactly, and I don't know how to tell it to use the correct one. – evanrelf Dec 10 '17 at 20:21
  • `map` was working fine when I only allowed `std::vector -> std::vector` and `std::string -> std::string`. One I tried allowing them to cross over, it stopped working. – evanrelf Dec 10 '17 at 20:24
  • 1
    Really neat idea! And a great learning exercise. Two factors to consider: 1. Your `std::vector` implementations will be strictly evaluated, and to get lazy behavior, you would need a container with thunks. 2. If you’ll be copying a lot of large list elements, consider storing `std::shared_ptr` in your vectors. So you might want a variant of `map` and catamorphisms that does this. – Davislor Dec 11 '17 at 12:00
  • 1
    An interesting special case you can overload and optimize: if your function has the same input and output type, and the input `vector` is a non-const rvalue reference, you could create the output `vector` in-place. – Davislor Dec 11 '17 at 12:03
  • 1
    Also, if the input is an rvalue, you could do an `.emplace_back(f(std::move(x)))` and possibly get a more-efficient implementation for some `f`. E.g., if `f` is scalar multiplication, you could get the behavior of `*=` rather than creating a temporary. – Davislor Dec 11 '17 at 20:40
  • a bystander question: have you considered using iterators and std::transform? could it work? – Will Ness Dec 12 '17 at 09:40

1 Answers1

8

In this declaration:

template <typename Function, typename Input, typename Output>
std::vector<Output> map(const Function &f, const std::vector<Input> &xs);

Output is a non-deduced context. The compiler will deduce the type of Function and Input from the provided arguments, but Output cannot be deduced - it would have to be explicitly provided. That's not going to happen.

What you want to do is yourself compute what Output is as a function of Function and Input. With a C++17 compiler/library, that's std::invoke_result_t (on C++11, use result_of). That is:

template <typename Function, typename Input,
    typename Output = std::invoke_result_t<Function const&, Input const&>>
std::vector<Output> map(const Function &f, const std::vector<Input> &xs);

It's a defaulted template parameter, but since the user won't actually ever provide it, the default argument will get used, which is what you want. Now this isn't entirely right either, since invoke_result_t could return something that you can't put in a vector (like, a reference). So we need to std::decay it. Moreover, you'll want to reserve the output vector since we know up front what its size will be:

template <typename Function, typename Input,
    typename Output = std::decay_t<std::invoke_result_t<Function&, Input const&>>>
std::vector<Output> map(Function&& f, const std::vector<Input> &xs)
{
    std::vector<Output> res;
    res.reserve(xs.size());
    for (auto&& elem : xs) {
        res.push_back(f(elem));
    }
    return res;
}

Now, if you want this to be able to take a string and either return a vector<X> or a string, that's now getting pretty complicated. You can't overload on return type in C++, so providing those two overloads is ill-formed. It happens to work in your case now, since the string --> vector<X> overload will be removed from consideration due to X not being deducible. But once you fix that, you'll run into that problem.

Barry
  • 286,269
  • 29
  • 621
  • 977
  • It sounds like this is the best answer. It's disappointing this is so tedious to deal with in C++. Is there no way to specify the type of the function argument? Something like `... map(const Function &f ...` to only accept functions that take an `int` and return a `bool`. – evanrelf Dec 10 '17 at 21:29
  • @evanrelf Well, you can take a `std::function` but I'm not sure why you'd actually want to do that. Why do you think you need to specify the type of the function argument? You *really* don't. – Barry Dec 10 '17 at 21:33
  • So could I write `filter` using `std::function` to restrict the first argument to only functions that take the type that is in the container and return a `bool`? – evanrelf Dec 10 '17 at 21:37
  • @evanrelf No, don't do that. The right way to do that is to verify that `std::invoke_result_t` is `bool` (or, really, is convertible to bool). (i.e. with `std::enable_if`) – Barry Dec 10 '17 at 21:43
  • @Barry In the future, do you think we will also include C++ concepts among the "right" ways? – chi Dec 10 '17 at 21:56
  • @chi Once we have a compiler that implements the working draft's definitions of concepts, sure :-) – Barry Dec 10 '17 at 21:59