Today I talked to a friend about the differences between statically and dynamically typed languages (more info about the difference between statically and dynamically typed languages in this SO question). After that, I was wondering what kind of trick can be used in C++ to emulate such dynamic behavior.

In C++, as in other statically typed languages, the type of a variable is specified at compile time. For example, let's say I have to read a large number of values from a file. Most of them are quite small, small enough to fit in an unsigned short. Here comes the tricky part: a few of these values are much bigger, big enough to need an unsigned long long to be stored.

Since I assume I'm going to do calculations with all of them, I want them all stored in the same container, in consecutive positions of memory, in the same order as I read them from the input file. The naive approach would be to store them in a vector of unsigned long long, but this typically means using up to 4 times the space that is actually needed (unsigned short: 2 bytes, unsigned long long: 8 bytes).

In dynamically typed languages, the type of a variable is interpreted at runtime and coerced to a type where it fits. How can I achieve something similar in C++?

My first idea is to use pointers: depending on its size, I store each number with the appropriate type. This has the obvious drawback of also having to store the pointer, but since I assume I'm going to store the numbers on the heap anyway, I don't think it matters.

I'm totally sure that many of you can give me way better solutions than this ...

#include <iostream>
#include <vector>
#include <limits>
#include <sstream>
#include <fstream>

int main() {
    std::ifstream f ("input_file");
    if (f.is_open()) {
        std::vector<void*> v;
        unsigned long long int num;
        while(f >> num) {
            if (num > std::numeric_limits<unsigned short>::max()) {
                v.push_back(new unsigned long long int(num));
            }
            else {
                v.push_back(new unsigned short(num));
            }
        }
        for (auto i: v) {
            delete i;
        }
        f.close();
    }
}

Edit 1: The question is not about saving memory. I know that in dynamically typed languages the space needed to store the numbers in the example is going to be way more than in C++. The question is about emulating a dynamically typed language with some C++ mechanism.

FrankS101
  • Ughh... which compiler lets you call `delete` on a `void*`? – Brian Bi Apr 12 '16 at 00:19
  • Yes, I know I shouldn't... – FrankS101 Apr 12 '16 at 00:20
  • Take a look at Folly's dynamic type https://github.com/facebook/folly/blob/master/folly/docs/Dynamic.md – QiMata Apr 12 '16 at 00:22
  • *This has the obvious drawback of having to also store the pointer, but since I assume I'm going to store them in the heap anyway, I don't think it matters.* If you're thinking about anyway storing numbers in the heap as separate memory allocations I don't think the whole question of using too much memory on storing the numbers matters. You're using a lot more memory just for this than you'll ever save on trying to use different types. – Sami Kuhmonen Apr 12 '16 at 00:22
  • Also, the warning is quite simple: you allocate memory for a short and then access that memory like it was allocated for a long long. That's going over the limits of memory allocation and a very bad thing. – Sami Kuhmonen Apr 12 '16 at 00:25
  • One option is to use a variant type: https://en.wikipedia.org/wiki/Variant_type – MrEricSir Apr 12 '16 at 00:27
  • ok, true about the warning, I edit the question to remove it since it is not relevant. – FrankS101 Apr 12 '16 at 00:29
  • @Sami Kuhmonen The whole thing is an example of dynamic typing in a statically typed language. The example can be better, but the question is not about saving memory, it's about emulating a dynamically typed language. – FrankS101 Apr 12 '16 at 00:32
  • In a dynamically typed language you would use a heck of a lot more than 6 bytes per number, in administrative overhead. If you're happy with that in some script language, why not in C++? Anyway the proposed idea of allocating these numbers dynamically also has extreme overhead, where the storage used for the numbers themselves is insignificant. – Cheers and hth. - Alf Apr 12 '16 at 00:39
  • `delete i` where `i` is a `void*` is invalid in standard C++ and **should not compile**. – Cheers and hth. - Alf Apr 12 '16 at 00:41
  • @Cheersandhth.-Alf it compiles with g++ 5.2.1 (with a warning) – FrankS101 Apr 12 '16 at 00:45
  • @MrEricSir The standard way to implement variants is to utilize a `union`. This, combined with needing to store the data type and correctly handle memory alignment would result in more memory usage, which is exactly what the OP seems to be trying to avoid. – paddy Apr 12 '16 at 00:48
  • @Cheers, `delete (void*)p` has always compiled. But it's **undefined behavior**. Most compilers may not even care about this UB for basic data types like `int, double`. BTW Frank, you may use a `malloc/free` pair to avoid the UB caused by `delete` and call the destructor inside a `template` wrapper. Invoking the destructor for any type via a `template` wrapper is allowed. But I am not sure if it gives a "dynamic feeling". :-) – iammilind Apr 12 '16 at 01:09
  • Maybe related: http://stackoverflow.com/q/8457961 – Kerrek SB Apr 12 '16 at 08:04

4 Answers

Options include...

Discriminated union

The code specifies a set of distinct, supported types T0, T1, T2, T3..., and - conceptually - creates a management type along these lines:

struct X
{
    enum { F0, F1, F2, F3... } type_;
    union { T0 t0_; T1 t1_; T2 t2_; T3 t3_; ... };
};

Unions restrict which types they can hold, and bypassing those restrictions with placement new means taking care of alignment and correct destructor invocation yourself. A generalised implementation therefore becomes complicated, and it's normally better to use boost::variant<>. Note that the type_ field requires some space, the union is at least as large as the largest of sizeof t0_, sizeof t1_..., and padding may be required.
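
To make the discriminated-union option concrete for the two integer types in the question, here is a minimal sketch. It assumes a C++17 compiler and uses std::variant, the standard library's type-safe union, rather than boost::variant<>. The names Number, make_number, to_ull and sum are made up for illustration.

```cpp
#include <limits>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical alias: a value that is either "small" or "large".
using Number = std::variant<unsigned short, unsigned long long>;

// Store the value in the smallest alternative that can hold it.
inline Number make_number(unsigned long long n) {
    if (n <= std::numeric_limits<unsigned short>::max()) {
        return Number(std::in_place_type<unsigned short>,
                      static_cast<unsigned short>(n));
    }
    return Number(std::in_place_type<unsigned long long>, n);
}

// Recover the value no matter which alternative is currently active.
inline unsigned long long to_ull(const Number& x) {
    return std::visit([](auto v) -> unsigned long long { return v; }, x);
}

// Calculations can then treat all elements uniformly.
inline unsigned long long sum(const std::vector<Number>& xs) {
    unsigned long long total = 0;
    for (const Number& x : xs) {
        total += to_ull(x);
    }
    return total;
}
```

On common 64-bit implementations sizeof(Number) comes out at 16 bytes because of the type tag and padding, so this buys dynamic behaviour rather than space.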

std::type_info

It's also possible to have a templated constructor and assignment operator that call typeid and record the std::type_info, allowing future operations like "recover-the-value-if-it's-of-a-specific-type". The easiest way to pick up this behaviour is to use boost::any.
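
As a rough illustration of the "recover-the-value-if-it's-of-a-specific-type" idea, the sketch below uses std::any, which is the C++17 standard-library analogue of boost::any (an assumption about your toolchain, since the answer itself only names Boost). The helpers box_number, try_get and to_ull are hypothetical names.

```cpp
#include <any>
#include <limits>
#include <typeinfo>

// Box the value using the smallest of the two types that can hold it.
inline std::any box_number(unsigned long long n) {
    if (n <= std::numeric_limits<unsigned short>::max()) {
        return std::any(static_cast<unsigned short>(n));
    }
    return std::any(n);
}

// Recover the value only if the stored type is exactly T.
template <typename T>
bool try_get(const std::any& a, T& out) {
    if (const T* p = std::any_cast<T>(&a)) {
        out = *p;
        return true;
    }
    return false;
}

// Dispatch on the recorded std::type_info to widen either type.
inline unsigned long long to_ull(const std::any& a) {
    if (a.type() == typeid(unsigned short)) {
        return std::any_cast<unsigned short>(a);
    }
    if (a.type() == typeid(unsigned long long)) {
        return std::any_cast<unsigned long long>(a);
    }
    throw std::bad_any_cast();
}
```

Unlike the discriminated union, the set of storable types is open-ended here, at the price of checking the type at run time on every access.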

Run-time polymorphism

You can create a base type with virtual destructor and whatever functions you need (e.g. virtual void output(std::ostream&)), then derive a class for each of short and long long. Store pointers to the base class.
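
A minimal sketch of that hierarchy might look as follows. The class names Number, ShortNumber and LongNumber, the value() accessor and the make_number and to_text helpers are invented for this example, and std::make_unique assumes C++14 or later.

```cpp
#include <limits>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Base type: virtual destructor plus whatever operations are needed.
struct Number {
    virtual ~Number() = default;
    virtual void output(std::ostream& os) const = 0;
    virtual unsigned long long value() const = 0;
};

struct ShortNumber final : Number {
    explicit ShortNumber(unsigned short v) : v_(v) {}
    void output(std::ostream& os) const override { os << v_; }
    unsigned long long value() const override { return v_; }
private:
    unsigned short v_;
};

struct LongNumber final : Number {
    explicit LongNumber(unsigned long long v) : v_(v) {}
    void output(std::ostream& os) const override { os << v_; }
    unsigned long long value() const override { return v_; }
private:
    unsigned long long v_;
};

// Pick the derived class at run time, based on the value read.
inline std::unique_ptr<Number> make_number(unsigned long long n) {
    if (n <= std::numeric_limits<unsigned short>::max()) {
        return std::make_unique<ShortNumber>(static_cast<unsigned short>(n));
    }
    return std::make_unique<LongNumber>(n);
}

// Convenience wrapper around the virtual output function.
inline std::string to_text(const Number& n) {
    std::ostringstream os;
    n.output(os);
    return os.str();
}
```

A std::vector<std::unique_ptr<Number>> can then hold both kinds of element, and because the destructor is virtual, destroying the elements through the base pointer is well defined, unlike calling delete on a void*.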

Custom solutions

In your particular scenario, you've only got a few large numbers: you could do something like reserve one of the short values to be a sentinel indicating that the actual value at this position can be recreated by bitwise shifting and ORing of the following 4 values. For example...

10 299 32767 0 0 192 3929 38

...could encode:

10
299
// 32767 is a sentinel indicating next 4 values encode long long
(0 << 48) + (0 << 32) + (192 << 16) + 3929
38

The concept here is similar to UTF-8 encoding for international character sets. This will be very space efficient, but it suits forward iteration, not random access indexing a la [123].
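
One possible way to code this scheme is sketched below, assuming 16-bit slots (std::uint16_t) and 32767 as the reserved sentinel, as in the example. The names kSentinel, encode and decode are hypothetical. A value equal to the sentinel itself has to be stored in escaped form too, which is a detail the example above does not spell out.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Reserved marker: "the next 4 slots encode one 64-bit value".
constexpr std::uint16_t kSentinel = 32767;

// Append one value. Values that fit in 16 bits (and are not the
// sentinel) take one slot; everything else takes five slots.
inline void encode(std::vector<std::uint16_t>& out, std::uint64_t n) {
    if (n <= 0xFFFFu && n != kSentinel) {
        out.push_back(static_cast<std::uint16_t>(n));
        return;
    }
    out.push_back(kSentinel);
    for (int shift = 48; shift >= 0; shift -= 16) {
        // Most significant 16-bit chunk first.
        out.push_back(static_cast<std::uint16_t>((n >> shift) & 0xFFFFu));
    }
}

// Forward iteration only: rebuild the original sequence.
inline std::vector<std::uint64_t> decode(const std::vector<std::uint16_t>& in) {
    std::vector<std::uint64_t> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != kSentinel) {
            out.push_back(in[i]);
            continue;
        }
        if (in.size() - i < 5) {
            throw std::runtime_error("truncated escape sequence");
        }
        std::uint64_t n = 0;
        for (std::size_t k = 1; k <= 4; ++k) {
            n = (n << 16) | in[i + k];  // shift in the next chunk
        }
        out.push_back(n);
        i += 4;  // skip the four chunks just consumed
    }
    return out;
}
```

Decoding the sequence 10 299 32767 0 0 192 3929 38 with this sketch yields 10, 299, (192 << 16) + 3929 and 38, matching the example above.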

Tony Delroy

You could create a class for storing dynamic values:

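#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

enum class dyn_type {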
  none_type,
  integer_type,
  fp_type,
  string_type,
  boolean_type,
  array_type,
  // ...
};

class dyn {
  dyn_type type_ = dyn_type::none_type;
  // Unrestricted union:
  union {
    std::int64_t integer_value_;
    double fp_value_;
    std::string string_value_;
    bool boolean_value_;
    std::vector<dyn> array_value_;
  };
public:
  // Constructors
  dyn()
  {
    type_ = dyn_type::none_type;
  }
  dyn(std::nullptr_t) : dyn() {}
  dyn(bool value)
  {
    type_ = dyn_type::boolean_type;
    boolean_value_ = value;
  }
  dyn(std::int32_t value)
  {
    type_ = dyn_type::integer_type;
    integer_value_ = value;
  }
  dyn(std::int64_t value)
  {
    type_ = dyn_type::integer_type;
    integer_value_ = value;
  }
  dyn(double value)
  {
    type_ = dyn_type::fp_type;
    fp_value_ = value;
  }
  dyn(const char* value)
  {
    type_ = dyn_type::string_type;
    new (&string_value_) std::string(value);
  }
  dyn(std::string const& value)
  {
    type_ = dyn_type::string_type;
    new (&string_value_) std::string(value);
  }
  dyn(std::string&& value)
  {
    type_ = dyn_type::string_type;
    new (&string_value_) std::string(std::move(value));
  }
  // ....

  // Clear
  void clear()
  {
    switch(type_) {
    case dyn_type::string_type:
      string_value_.std::string::~string();
      break;
    //...
    }
    type_ = dyn_type::none_type;
  }
  ~dyn()
  {
    this->clear();
  }

  // Copy:
  dyn(dyn const&);
  dyn& operator=(dyn const&);

  // Move:
  dyn(dyn&&);
  dyn& operator=(dyn&&);

  // Assign:
  dyn& operator=(std::nullptr_t);
  dyn& operator=(std::int64_t);
  dyn& operator=(double);
  dyn& operator=(bool);   

  // Operators:
  dyn operator+(dyn const&) const;
  dyn& operator+=(dyn const&);
  // ...

  // Query
  dyn_type type() const { return type_; }
  std::string& string_value()
  {
    assert(type_ == dyn_type::string_type);
    return string_value_;
  }
  // ....

  // Conversion
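  explicit operator bool() const
  {
    switch(type_) {
    case dyn_type::none_type:
      return false; // an empty value is falsy, as in most dynamic languages
    case dyn_type::integer_type:
      return integer_value_ != 0;
    case dyn_type::fp_type:
      return fp_value_ != 0.0;
    case dyn_type::boolean_type:
      return boolean_value_;
    // ...
    }
    return false; // avoids falling off the end for types not handled above
  }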
  // ...
};

Used with:

std::vector<dyn> xs;
xs.push_back(3);
xs.push_back(2.0);
xs.push_back("foo");
xs.push_back(false);
ysdx
  • Since C++17, the standard library includes an [std::variant](https://en.cppreference.com/w/cpp/utility/variant) type to store dynamic values. – Anderson Green May 13 '21 at 19:13

An easy way to get dynamic language behavior in C++ is to use a dynamic language engine, e.g. for JavaScript.

Or, for example, the Boost library provides an interface to Python.

Possibly that will deal with a collection of numbers in a more efficient way than you could do yourself, but still it's extremely inefficient compared to just using an appropriate single common type in C++.

Cheers and hth. - Alf

The normal way of dynamic typing in C++ is a boost::variant or a boost::any.

But in many cases you don't want to do that. C++ is a great statically typed language, and forcing it to be dynamically typed is just not its best use case (especially not to save memory). Use an actual dynamically typed language instead, as it is very likely better optimized (and easier to read) for that use case.

Mark B