0

I have a decision tree node class defined in the following way:

class dt_node
{
public:    
    dt_node* child[2]; //two child nodes
    int feature;
    double value; // feature and value this node splits on
    bool leaf;
    double pred; // what this node predicts if leaf node
};

Is there a way I can write this to a file and reconstruct the tree from the file if needed?

ascetic652

3 Answers

1

You can do it any way you want...

And the real answer: it really is up to you. If I had to save this kind of object in a .txt file, I would just make up a format for the structure, for example 0*0*0.0*0*0.0, where the first 0 is the number of child nodes, the second 0 is the feature, and so on, with the * character separating the values. Spaces could work better, but I just don't like them as separators in my files... The text file would then have some other character (for example, a |) between consecutive objects. An example would look like 3*22*31.11*1*1.0|2*2*1.0*0*33.3.
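A minimal sketch of how one such record could be written and parsed back. The names `node_record`, `encode` and `decode` are made up for illustration, and the field order (child count, feature, value, leaf, pred) is just my reading of the example above:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical pointer-free record holding the per-node fields from the question.
struct node_record {
    int children;  // number of child nodes
    int feature;   // feature index this node splits on
    double value;  // split threshold
    bool leaf;     // true if this is a leaf node
    double pred;   // prediction if leaf
};

// Encode one record as "children*feature*value*leaf*pred".
// Note: the default stream precision (6 significant digits) is used here,
// so doubles with more digits will be rounded.
std::string encode(const node_record& r)
{
    assert(r.children >= 0);
    std::ostringstream out;
    out << r.children << '*' << r.feature << '*' << r.value << '*'
        << (r.leaf ? 1 : 0) << '*' << r.pred;
    return out.str();
}

// Decode one record; returns false if the text does not match the format.
bool decode(const std::string& text, node_record& r)
{
    std::istringstream in(text);
    char s1 = 0, s2 = 0, s3 = 0, s4 = 0;
    int leaf = 0;
    if (!(in >> r.children >> s1 >> r.feature >> s2 >> r.value >> s3
             >> leaf >> s4 >> r.pred))
        return false;
    if (s1 != '*' || s2 != '*' || s3 != '*' || s4 != '*')
        return false;
    r.leaf = (leaf != 0);
    return true;
}
```

Whole records can then be joined with `|` and split again with `std::getline(stream, piece, '|')` before calling `decode` on each piece.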

Obviously I could have misinterpreted your question. If you are asking whether there is a way to save this particular code and execute it by opening the file in a program that lacks the dt_node class, then unfortunately my knowledge is not sufficient to answer.

Hope it helps anyhow.

Fureeish
1

If you would like to write the format yourself, I would write every node's parameters to the file (two doubles, a bool and an int) along with its level, starting from the root node and then proceeding recursively through the tree. As far as I can see, the bool controls whether the node has any children; this will help when reading the file.

Reading the file will be a bit more complex than writing it. For each node you read, again recursively, keep reading the following nodes until you reach one whose level is less than or equal to the current node's. It sounds complex, but it really isn't.

Of course you shouldn't write the dt_node* pointers to the file: the addresses are meaningless once the program exits, and you will have to recreate the full tree when reading the file anyway.
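A rough sketch of the recursive write/read idea. It is a simplified variant, not exactly the scheme described above: instead of storing a level per node, it relies on the leaf flag alone, assuming every non-leaf node has exactly two children. The helper names (`save_tree`, `load_tree`, `free_tree`, `tree_to_string`) and the default member initialisers are my own additions:

```cpp
#include <cassert>
#include <iomanip>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>

class dt_node
{
public:
    dt_node* child[2] = {nullptr, nullptr}; // initialisers added for safety (assumption)
    int feature = 0;
    double value = 0.0;
    bool leaf = true;
    double pred = 0.0;
};

// Recursively delete a tree built with new.
void free_tree(dt_node* n)
{
    if (!n) return;
    free_tree(n->child[0]);
    free_tree(n->child[1]);
    delete n;
}

// Write the subtree in pre-order, one node per line: "leaf feature value pred".
// max_digits10 is set on the stream so doubles survive the round trip.
void save_tree(std::ostream& out, const dt_node* n)
{
    assert(n != nullptr);
    out << std::setprecision(std::numeric_limits<double>::max_digits10)
        << n->leaf << ' ' << n->feature << ' ' << n->value << ' ' << n->pred << '\n';
    if (!n->leaf) {
        save_tree(out, n->child[0]);
        save_tree(out, n->child[1]);
    }
}

// Read a subtree written by save_tree. Returns nullptr on malformed input;
// otherwise the caller owns the result (release it with free_tree).
dt_node* load_tree(std::istream& in)
{
    dt_node* n = new dt_node;
    if (!(in >> n->leaf >> n->feature >> n->value >> n->pred)) {
        delete n;
        return nullptr;
    }
    if (!n->leaf) {
        n->child[0] = load_tree(in);
        n->child[1] = load_tree(in);
        if (!n->child[0] || !n->child[1]) {
            free_tree(n);
            return nullptr;
        }
    }
    return n;
}

// Convenience wrapper: serialise a tree to a string.
std::string tree_to_string(const dt_node* root)
{
    std::ostringstream out;
    save_tree(out, root);
    return out.str();
}
```

Both functions take generic streams, so passing a `std::ofstream` or `std::ifstream` gives you the file version.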

sx107
1

Adding Boost to your project can be a bit of a pain, but there are quite a few libraries in there, including maths and graphics, so it may well be worth the effort.

The Boost serialisation docs are here with a tutorial here

The serialisation library lets you add as little as one function to your class that defines how to save and load the state of that class. How the data is actually stored is then handled by the Boost library; for example, you can have it save as binary, XML or text.

The only thing you need to watch out for is that the binary serialisation is not portable between machines.

UKMonkey
  • dt_node class has two children of the same type. Do I have to consider recursion when using boost::serialization? @UKMonkey – ascetic652 Dec 14 '16 at 16:33
  • Looking more carefully, some parts of this also made it into the std library, https://isocpp.org/wiki/faq/serialization In any case, yes you will have to consider recursion, probably the simplest way to do this is to perform 2 passes; one to generate a structure with all your nodes flattened with a unique id (their pointer?), and then one to save that structure. – UKMonkey Dec 14 '16 at 16:55
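
For what it's worth, the two-pass idea from the last comment could look roughly like this. It is only a sketch under my own assumptions: the id is the node's index in a vector rather than its pointer, -1 marks "no child", and `flat_node`, `flatten`, `unflatten` and `free_tree` are hypothetical names. The second pass (writing the flat vector out) is then just a loop over plain records:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class dt_node
{
public:
    dt_node* child[2] = {nullptr, nullptr}; // initialisers added for safety (assumption)
    int feature = 0;
    double value = 0.0;
    bool leaf = true;
    double pred = 0.0;
};

// Pointer-free form of one node; child indices refer to positions in the vector.
struct flat_node {
    int feature;
    double value;
    bool leaf;
    double pred;
    int child[2]; // -1 means "no child"
};

// Pass 1: append the subtree in pre-order and return the index assigned to n.
// Children are only visited for non-leaf nodes.
int flatten(const dt_node* n, std::vector<flat_node>& out)
{
    assert(n != nullptr);
    const int id = static_cast<int>(out.size());
    flat_node f{n->feature, n->value, n->leaf, n->pred, {-1, -1}};
    out.push_back(f);
    if (!n->leaf) {
        for (int i = 0; i < 2; ++i) {
            // Store the result first: the recursive call may reallocate the vector.
            const int c = flatten(n->child[i], out);
            out[id].child[i] = c;
        }
    }
    return id;
}

// Rebuild the tree from the flat form, starting at index id.
// Assumes the indices come from flatten (no cycles); caller owns the result.
dt_node* unflatten(const std::vector<flat_node>& flat, int id = 0)
{
    if (id < 0 || static_cast<std::size_t>(id) >= flat.size())
        return nullptr;
    const flat_node& f = flat[id];
    dt_node* n = new dt_node;
    n->feature = f.feature;
    n->value = f.value;
    n->leaf = f.leaf;
    n->pred = f.pred;
    n->child[0] = unflatten(flat, f.child[0]);
    n->child[1] = unflatten(flat, f.child[1]);
    return n;
}

// Recursively delete a tree built by unflatten.
void free_tree(dt_node* n)
{
    if (!n) return;
    free_tree(n->child[0]);
    free_tree(n->child[1]);
    delete n;
}
```

Because the flat records hold no pointers, each one can be written as a line of text or handed to whatever serialisation library you end up using.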