Degrees of immutability.
Re
” My current goal is to read data from a file to construct an object so that it cannot be modified subsequently
there are degrees of immutability, such as:
Totally immutable.
That's the good old const, either for the type or for individual data members. Drawback: can't be moved, so, for example, it forces copying when used as function return value. However, the compiler may optimize away such copying, and will usually do so.
Immutable but movable.
This allows efficient function return values even when the compiler doesn't optimize. Also great for passing an original temporary down a by-value call chain where the bottom function stores a copy: it can be moved all the way.
Immutable but movable and copy assignable.
Assignable may not sound like it belongs in the same design direction as immutable, and indeed a novice may think that these attributes are in direct conflict, but e.g. Python and Java strings are examples of this: they're immutable, but assignable. Essentially this is a neat way to encapsulate a handle-value approach. User code deals with handles, but it appears to be dealing directly with values. If user code could change a value, then some other part of the user code holding a handle to the same value would see the change, which would be ungood in the same way as a global variable's unexpected changes. Hence the values need to be immutable, but not the user code objects (which can be just handles).
The last point shows that there's a logical design level need to distinguish internal values from user code objects.
With this point of view the first point above is about both values and objects being immutable; the second point has immutable values and generally immutable objects, but allows efficient and delightfully low level pilfering of values from temporary objects, leaving them logically empty; and the third point has immutable values but objects that are mutable with respect to both copy assignment and moving from temporaries.
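As a rough illustration of the handle-value idea behind the third point, here is a minimal sketch. The Handle class and the use of std::shared_ptr are my assumptions for the illustration, not part of the Person classes in this answer: the shared value is never modified, while the handle objects can be freely copied and reassigned.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace sketch {
    struct Data { std::wstring name; int birthyear; };

    // Hypothetical handle class: the shared Data value is immutable,
    // but the handle itself can be copied, moved and reassigned.
    class Handle
    {
        std::shared_ptr<Data const> value_;

    public:
        explicit Handle( Data data )
            : value_( std::make_shared<Data>( std::move( data ) ) )
        {}

        auto operator->() const noexcept
            -> Data const*
        { return value_.get(); }
    };

    inline auto old_value_survives_assignment()
        -> bool
    {
        Handle a{ Data{ L"Ada", 1815 } };
        Handle b = a;                           // Both handles refer to one value.
        a = Handle{ Data{ L"Grace", 1906 } };   // Rebinds a; the old value is untouched.
        return b->name == L"Ada" && a->name == L"Grace";
    }
}  // namespace sketch
```

Assigning to a only changes which value a refers to, so b keeps seeing the value it was created with.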
Data.
For all three possibilities we can define a simple internal Data class like this:
Data.hpp:
    #pragma once
    #include "cppx.hpp"         // cppx::String, an alias for std::wstring

    namespace my {
        using cppx::String;

        struct Data
        {
            String      name;
            int         birthyear;
        };
    }  // namespace my
Here cppx.hpp is a little helper file with general convenience functionality¹, which I list at the end.
In your actual use case(s) the data class will probably have other data fields, but the main idea is that it's a simple aggregate class, just data. You can think of it as corresponding to the “value” in a handle-value approach. Next, let's define a class to use as the type of user code variables.
Totally immutable objects.
The following class implements the idea of user code objects that are totally immutable: the value set at initialization can't be changed at all, and persists until the object's destruction.
Person.all_const.hpp:
    #include "Data.hpp"         // my::(String, Data), cppx::*

    namespace my {
        using cppx::int_from;
        using cppx::line_from;
        using cppx::In_stream;      // alias std::wistream

        class Person
        {
        private:
            Data const  data_;

        public:
            auto operator->() const noexcept
                -> Data const*
            { return &data_; }

            explicit Person( In_stream& stream )
                try
                : data_{ line_from( stream ), int_from( stream ) }
            {} CPPX_RETHROW_X
        };
    }  // namespace my
Here
- the const for the data_ member provides the required total immutability;
- the operator-> gives easy access to the Data fields;
- the noexcept on operator-> may possibly help the compiler in some ways, but is mostly for the benefit of the programmer, namely documenting that this accessor doesn't throw;
- the constructor is explicit because at the design level it does not provide a conversion from the stream argument;
- the order of the calls to line_from and int_from, and hence the order of consumption of lines from the stream, is guaranteed² because this is a curly-braces initializer list;
- the line_from and int_from functions are <cppx.hpp> helpers that each read one line from the specified stream and attempt to return, respectively, the complete line string and the int produced by std::stoi, throwing an exception on failure; and
- the CPPX_RETHROW_X macro picks up the function name and rethrows the exception with that name prepended to the exception message, as a primitive explicit way to get a simple call stack trace in the exception.
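The evaluation order point in the list above can be checked with a small self-contained sketch. The names here (sketch::data_from and the cut-down line_from) are hypothetical stand-ins for the cppx helpers, and error handling is omitted.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace sketch {
    struct Data { std::wstring name; int birthyear; };

    // Cut-down stand-in for cppx::line_from, without error handling.
    inline auto line_from( std::wistream& stream )
        -> std::wstring
    {
        std::wstring line;
        std::getline( stream, line );
        return line;
    }

    inline auto data_from( std::wstring const& text )
        -> Data
    {
        std::wistringstream stream( text );
        // Braced initialization: the initializers are evaluated left to right,
        // so the name line is consumed before the birth year line.
        return Data{ line_from( stream ), std::stoi( line_from( stream ) ) };
    }
}  // namespace sketch
```

With parentheses instead of braces, as in an ordinary function call, the two reads could happen in either order, and the year line might then be consumed as the name.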
Instead of operator-> one could have defined an accessor method called data, say, returning a Data const&, but operator-> gives a very nice usage syntax, as exemplified below:
An example main program.
main.cpp:
    #include PERSON_HPP         // E.g. "Person.all_const.hpp"
    #include <iostream>
    #include <utility>          // std::move
    using namespace std;

    auto person_from( cppx::In_stream& stream )
        -> my::Person
    { return my::Person{ stream }; }

    void cppmain()
    {
        auto x = person_from( wcin );       // Will not be moved with the const version.
        wcout << x->name << " (born " << x->birthyear << ").\n";

        // Note: due to the small buffer optimization a short string may not be moved,
        // but instead just copied, even if the machinery for moving is there.
        auto const x_ptr = x->name.data();
        auto y = move( x );
        bool const was_moved = (y->name.data() == x_ptr);
        wcout << "An instance was " << (was_moved? "" : "not ") << "moved.\n";
    }

    auto main() -> int { return cppx::mainfunc( cppmain ); }
Here cppx::mainfunc, again a helper from <cppx.hpp>, takes care of catching an exception and displaying its message on the std::wcerr stream.
I use wide streams because that's the easiest way to support international characters for Windows console programs, and they also work in Unix-land (at least when one includes a call to setlocale, which is also done by cppx::mainfunc), so they're effectively the most portable option for this example. :)
The code at the end of main doesn't make much sense for the totally immutable const version, so let's look at the movable version.
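To see why the move at the end of main is pointless for the const version, here is a minimal sketch with a hypothetical Const_person class that, like the all-const Person, holds a const member. The implicit move constructor still exists, but for a const member it can only invoke the member's copy constructor.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {
    struct Const_person
    {
        std::wstring const name;    // const member, as in the all-const Person
    };

    // "Moving" compiles, but assignment of either kind is gone.
    static_assert( std::is_move_constructible<Const_person>::value, "" );
    static_assert( !std::is_copy_assignable<Const_person>::value, "" );
    static_assert( !std::is_move_assignable<Const_person>::value, "" );

    inline auto source_keeps_its_value()
        -> bool
    {
        Const_person x{ L"A name that is long enough to probably need a heap buffer" };
        Const_person y = std::move( x );    // Copies: a const member can't be pilfered.
        return x.name == y.name;            // The source still holds its full value.
    }
}  // namespace sketch
```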
Immutable but movable objects.
Person.movable.hpp:
    #include "Data.hpp"         // my::(String, Data), cppx::*
    #include <utility>          // std::move

    namespace my {
        using cppx::In_stream;
        using cppx::int_from;
        using cppx::line_from;
        using std::move;

        class Person
        {
        private:
            Data    data_;

            // Explicit return types, so this doesn't rely on return type deduction.
            auto operator=( Person const& ) -> Person& = delete;
            auto operator=( Person&& ) -> Person& = delete;

        public:
            auto operator->() const noexcept
                -> Data const*
            { return &data_; }

            explicit Person( In_stream& stream )
                try
                : data_{ line_from( stream ), int_from( stream ) }
            {} CPPX_RETHROW_X

            Person( Person&& other ) noexcept
                : data_{ move( other.data_ ) }
            {}
        };
    }  // namespace my
Note that a move constructor needs to be explicitly specified, as shown at the end above.
As g++ explains it, if one doesn't do that then
” 'my::Person::Person(const my::Person&)' is implicitly declared as deleted because 'my::Person' declares a move constructor or move assignment operator
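The resulting set of operations can be verified at compile time. The following is a sketch with a hypothetical cut-down Movable class (a single string member and no stream constructor), using the standard type traits:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {
    class Movable
    {
        std::wstring name_;

        auto operator=( Movable const& ) -> Movable& = delete;
        auto operator=( Movable&& ) -> Movable& = delete;

    public:
        explicit Movable( std::wstring name ): name_( std::move( name ) ) {}
        Movable( Movable&& other ) noexcept: name_( std::move( other.name_ ) ) {}

        auto name() const noexcept -> std::wstring const& { return name_; }
    };

    // Movable, but neither copyable nor assignable.
    static_assert( std::is_move_constructible<Movable>::value, "" );
    static_assert( !std::is_copy_constructible<Movable>::value, "" );
    static_assert( !std::is_copy_assignable<Movable>::value, "" );
    static_assert( !std::is_move_assignable<Movable>::value, "" );

    inline auto make() -> Movable { return Movable{ L"Ada" }; }

    inline auto moved_value_arrives()
        -> bool
    {
        Movable a = make();
        Movable b = std::move( a );     // Uses the explicit move constructor.
        return b.name() == L"Ada";
    }
}  // namespace sketch
```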
Immutable but movable and assignable objects (a loophole!).
To make the objects assignable one can simply remove the = delete declarations.
With those declarations gone the compiler again generates the copy and move constructors automatically, so the explicit move constructor can be removed, yielding
Person.assignable.hpp:
    #pragma once
    #include "Data.hpp"         // my::(String, Data), cppx::*
    #include <utility>          // std::move

    namespace my {
        using cppx::In_stream;
        using cppx::int_from;
        using cppx::line_from;

        class Person
        {
        private:
            Data    data_;

        public:
            auto operator->() const noexcept
                -> Data const*
            { return &data_; }

            explicit Person( In_stream& stream )
                try
                : data_{ line_from( stream ), int_from( stream ) }
            {} CPPX_RETHROW_X
        };
    }  // namespace my
This is shorter and simpler, which is good.
However, since it supports copy assignment it allows a modification of a part of the value of an instance x.
How? Well, one way is by copying the complete Data value out of x, modifying that Data instance, formatting a corresponding string with the values on two lines, using that to initialize a std::wistringstream, passing that stream to the Person constructor, and assigning that instance back to x. Phew! What a roundabout hack! But it shows that it's possible, in theory, and rather inefficiently, to write e.g. a set_birthyear function for the copy assignable Person class. And such loopholes, sort of security holes in the type, sometimes create problems.
Still, I'm only mentioning that loophole for completeness, so that one can be aware of it – and perhaps become aware of similar functionality loopholes in other code. And I think that I would personally choose this version of the Person class. For the simpler it is, the easier it is to use and maintain.
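For the curious, the roundabout hack can be sketched as follows. The Person class here is a cut-down stand-in for the assignable version (no cppx helpers, no error handling), and set_birthyear is a hypothetical name; the sketch also assumes that the name contains no line break.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace sketch {
    struct Data { std::wstring name; int birthyear; };

    inline auto line_from( std::wistream& stream )
        -> std::wstring
    {
        std::wstring line;
        std::getline( stream, line );
        return line;
    }

    // Cut-down stand-in for the assignable Person class.
    class Person
    {
        Data data_;

    public:
        auto operator->() const noexcept
            -> Data const*
        { return &data_; }

        explicit Person( std::wistream& stream )
            : data_{ line_from( stream ), std::stoi( line_from( stream ) ) }
        {}
    };

    // Hypothetical "setter" that uses only the public interface.
    inline void set_birthyear( Person& x, int const year )
    {
        Data data = *x.operator->();            // Copy the value out.
        data.birthyear = year;                  // Modify the copy.
        std::wistringstream stream(             // Format it as two lines.
            data.name + L"\n" + std::to_wstring( data.birthyear ) + L"\n"
            );
        x = Person( stream );                   // Assign a new instance back.
    }
}  // namespace sketch
```

Deleting the assignment operators, as in the movable version, closes this particular loophole.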
For completeness: the cppx support used above.
cppx.hpp:
    #pragma once
    #include <iostream>     // std::(wcerr, wistream)
    #include <locale.h>     // setlocale, LC_ALL
    #include <stdexcept>    // std::runtime_error
    #include <string>       // std::(wstring, stoi)
    #include <stdlib.h>     // EXIT_...

    #ifndef CPPX_QUALIFIED_FUNCNAME
    #   if defined( _MSC_VER )
    #       define CPPX_QUALIFIED_FUNCNAME  __FUNCTION__
    #   elif defined( __GNUC__ )
    #       define CPPX_QUALIFIED_FUNCNAME  __PRETTY_FUNCTION__     // Includes signature.
    #   else
    #       define CPPX_QUALIFIED_FUNCNAME  __func__    // Unqualified but portable C++11.
    #   endif
    #endif

    // Poor man's version, roughly O(n^2) in the number of stack frames unwound.
    #define CPPX_RETHROW_X \
        catch( std::exception const& x ) \
        { \
            cppx::fail( \
                cppx::Byte_string() + CPPX_QUALIFIED_FUNCNAME + " | " + x.what() \
                ); \
        }

    namespace cppx {
        using std::endl;
        using std::exception;
        using std::runtime_error;
        using std::stoi;

        using String = std::wstring;
        using Byte_string = std::string;
        using In_stream = std::wistream;
        using Out_stream = std::wostream;

        struct Sys
        {
            In_stream&  in  = std::wcin;
            Out_stream& out = std::wcout;
            Out_stream& err = std::wcerr;
        };

        Sys const sys = {};

        [[noreturn]]
        inline auto fail( Byte_string const& s )
            -> bool
        { throw runtime_error( s ); }

        inline auto line_from( In_stream& stream )
            -> String
        try
        {
            String result;
            getline( stream, result ) || fail( "getline" );
            return result;
        } CPPX_RETHROW_X

        inline auto int_from( In_stream& stream )
            -> int
        try
        {
            return stoi( line_from( stream ) );
        } CPPX_RETHROW_X

        inline auto mainfunc( void (&f)() )
            -> int
        {
            setlocale( LC_ALL, "" );    // E.g. for Unixland wide streams.
            try
            {
                f();
                return EXIT_SUCCESS;
            }
            catch( exception const& x )
            {
                sys.err << "! " << x.what() << endl;
            }
            return EXIT_FAILURE;
        }
    }  // namespace cppx
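To show the kind of message that the rethrow macro builds up, here is a simplified sketch. SKETCH_RETHROW_X is a hypothetical variant that always uses the portable __func__ and throws directly instead of calling cppx::fail; the functions inner, outer and trace are made up for the illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical simplified variant of CPPX_RETHROW_X, using only __func__.
#define SKETCH_RETHROW_X \
    catch( std::exception const& x ) \
    { \
        throw std::runtime_error( std::string( __func__ ) + " | " + x.what() ); \
    }

namespace sketch {
    inline auto inner()
        -> int
    try
    {
        throw std::runtime_error( "boom" );
    } SKETCH_RETHROW_X

    inline auto outer()
        -> int
    try
    {
        return inner() + 1;
    } SKETCH_RETHROW_X

    // Returns the accumulated message, with one function name per level.
    inline auto trace()
        -> std::string
    {
        try { outer(); }
        catch( std::exception const& x ) { return x.what(); }
        return "";
    }
}  // namespace sketch
```

Here trace() yields "outer | inner | boom". Each level copies the message so far, which is where the roughly quadratic cost mentioned in the header comment comes from.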
¹ I think it would be nice if the Stack Overflow C++ community could standardize on such a file, to reduce the cognitive burden of reading examples in answers, and possibly in questions too!, but I think most readers will find my (and anyone else's) helpers pretty alien at first sight, and secondly I'm just too lazy to bring this idea over to the C++ Lounge and discuss it there, which IMO would be the way to do it.
² See “Order of evaluation of elements in list-initialization”.