How do I represent algebraic terms in C++ using classes: a term can be raised to a vector of terms

Question

I'm trying to make an algebraic parser, and I've gotten decently far. I'm at the point where I can successfully multiply polynomials.

Full code here. Use the constructors along with the '*' overload and convert the resulting algexpr object to a string using .latex() to multiply and print to stdout. It's alpha at best, and I know there are many optimizations I need to make, I also still need to implement division, but that's not what my question is about.

x^2 is as algebraic term. So it 3ax, 7x^2(sin(x))zx, sin(a+b), and x^(2n+1). I'm having problems with the ones I have made bold. Both of these involve algebraic expressions, even though they are algebraic terms, but an algebraic expression is a vector of algebraic terms. So in order to completely represent an algebraic term, I need an algebraic expression, but an algebraic expression is a vector of algebraic terms. You can start to see my issue. I want to be able to use algebraic expressions inside algebraic terms.

The program doesn't work with the terms in bold, nor any expressions involving them. It enters an infinite loop instead, which is understandable since I haven't been able to figure out how I'd implement this yet.

Here's a minimal reproducible example to illustrate my troubles. This is what I want to do, however, this won't work.

code.cpp:8:3: error: ‘algexpr’ does not name a type
    8 |   algexpr power; // I can't use algexpr here because it's not defined yet.

code.cpp:

class variable {
public:
  std::string name;
  algexpr power; // I can't use algexpr here because it's not defined yet.

  variable(std::string str) {
    // fancy parsing code here
    if (str == "x") {
      name = "x";
      power = algexpr("1");
    }
  }
};

class algterm {
public:
  rfraction constant;
  std::vector<variable> variables;

  algterm(std::string str) {
    // fancy parsing code here
    if (str == "parse") {
      constant = rfraction("1");
      variables.push_back(variable("x"));
    }
  }
};

class algexpr {
public:
  std::vector<algterm> terms;

  algexpr(std::string str) {
    // fancy parsing code here
    terms.push_back(algterm("parse"));
  }
};

I've tried thinking about what to do, but couldn't come up with anything that great. One idea was to store everything as strings (the expressions, variables, and terms) and just re-parse it every single time, but that's clearly not that efficient. How do I solve this?

For starters, you will need to get rid of inline method definitions. Then, it's going to be a simple matter of using forward declarations. Easy-peasy. Note that only C++17 guarantees when a vector can be instantiated with a forward-declared class. Then, after everything has been declared, you can then define your methods. Do you know what forward declarations are, and how to use them? — Sam Varshavchik, Feb 26 '23 at 17:20
@SamVarshavchik No, I do not. Could you suggest some good sources for the same? — avighnac, Feb 26 '23 at 17:23
[Which C++ textbook are you using](https://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list) to learn the most complicated and hardest to learn general purpose programming language in use today? You should find a complete explanation in every introductory textbook listed at that link. — Sam Varshavchik, Feb 26 '23 at 17:26
@avighnac -- Why are you using `calloc` in a C++ program? And seeing this, where do you deallocate the memory? You should be using std::string, containers, and if pointers, smart pointers. — PaulMcKenzie, Feb 26 '23 at 17:32
Well, parsing algebraic expressions is not the easiest problem to solve. You will need some form of a parse tree or abstract syntax tree, which will be very self referential, so you need to deal with pointers and forward declarations. However before trying to solve such a complicated problem in C++, I would adwise you to learn the basics of the language first. I don't mean to be condescending and this is not impossible to learn, but try to go one step at a time :) — chrysante, Feb 26 '23 at 17:33
Ok, I know the basics of forward declarations (such as how classes are split into .cpp and .hpp files and stuff), but in this particular scenario, the class depends on its another class which depends on the class that depends on the other class and so on :cry: — avighnac, Feb 26 '23 at 17:37
No, that's not what forward declaration actually means: "split into .cpp and .hpp files". That's not what it is. — Sam Varshavchik, Feb 26 '23 at 17:38
@PaulMcKenzie the "basic_math_operations" library I wrote is written in assembly and C, so I kind of need to. Now that you point it out, I have noticed that there is not a corresponding `free` to this `calloc`, but I never claimed that my code was perfect! I know there are a lot of bugs and I just posted it for context reasons. — avighnac, Feb 26 '23 at 17:39
@SamVarshavchik you can forward declare a function, and that's exactly what's done when you split a class into .cpp and .hpp files — avighnac, Feb 26 '23 at 17:39
Something can be forward-declaring without splitting anything. In one file. — Sam Varshavchik, Feb 26 '23 at 17:40
@SamVarshavchik That's not the reason; I've stated the reason in a previous comment. — avighnac, Feb 26 '23 at 17:40
@SamVarshavchik Yeah, that's possible too. I just can't figure out how to do that in this particular case. The point is, I do know what forward declarations are, but not how I'd do it in this scenario. — avighnac, Feb 26 '23 at 17:41
Draw a dependency diagram, noting which class declarations only requires the classes they reference to be forward-declared, and which ones must be defined. Declare the classes in the order that only needs the dependencies to be forward-declared. Problem solved. — Sam Varshavchik, Feb 26 '23 at 17:42
@SamVarshavchik how would I do that? `algterm` needs `algexpr`, but `algexpr` obviously also needs `algterm` — avighnac, Feb 26 '23 at 17:47
Nope, `algterm` doesn't even reference `algexpr`, directly or indirectly. `algterm` doesn't need `variable` to be defined, which requires `algexpr`, it only needs to be forward-declared as of C++17 (and many compilers allowed it even in older C++ versions). And `algexpr` only needs a forward declaration, too. Both of these classes can easily forward-declare their dependencies. But this is a pointless discussion, it's clear that it will be more productive to focus on learning C++ basics, with a good textbook, at the link that I helpfully provided in my earlier comment. — Sam Varshavchik, Feb 26 '23 at 17:53
@SamVarshavchik I mean, while that does partially remove the dependency, I can't actually access any of the members of the `algterm` class, such as the constructor which I'd need as I've clearly shown in my original question. — avighnac, Feb 26 '23 at 19:46
You can certainly access it ***after*** everything is defined. The very first sentence, of my very first comment, reads: "you will need to get rid of inline method definitions". You only ***declare*** everything first, don't ***define*** anything, not constructors, nor any methods. Then, once everything is ***declared*** you're on your own merry way to ***define*** constructors, that can access everything to its heart's desire. You can split that off into a separate `.cpp` file, if you so wish, or make it an inline definition in the same header file, it's up to you. — Sam Varshavchik, Feb 26 '23 at 19:49
I amend my previous comment to state "after everything is ***declared***", instead. Even after many years working with C++ the fine, nuanced, distinctions between forward declarations, declarations and definitions are often mixed up. Which is why it's important to thoroughly study and understand core C++ fundamentals, with a good textbook, that fully explains these fundamental concepts. Neither Stackoverflow, nor a search engine, proved historically to be a successful replacement for a textbook. — Sam Varshavchik, Feb 26 '23 at 20:09
@avighnac `char *buf = (char *)calloc(std::max(number.length(), n.number.length()) + 2, 1); add(n.number.c_str(), number.c_str(), buf);` could have been `std::vector buf(std::max(number.length(), n.number.length()) + 2, 1); add(n.number.c_str(), number.c_str(), &buf[0]);` -- That takes care of one potential memory leak. Just because the "old code" takes a pointer to `char` doesn't mean you must use the same, old, technique to create the char buffer. — PaulMcKenzie, Feb 26 '23 at 21:37
@avighnac *so I kind of need to.* -- No you didn't need to use `calloc`. When a legacy API calls for a pointer to `T` as an argument, it doesn't mean you must literally declare a `T*` on the client end. As you can see `std::vector::data()` gives you the pointer to `T` that your `add` function will accept. At the end of the function, the vector cleans itself up, and no memory leak. A vector stores its data contiguously -- thus there is litte to no reason to use `calloc`, `malloc` or even `new[]` in a modern C++ program. — PaulMcKenzie, Feb 26 '23 at 21:41
@PaulMcKenzie Woah, I didn’t know std::vector creates a memory block that can be passed to legacy APIs. Sure, I’ll use that instead of manually dynamically allocating memory. — avighnac, Feb 27 '23 at 01:06
@avighnac -- There are officially two STL containers that store the data in a contiguous block: `std::array` and `std::vector`. Both have `.data()` member functions, giving you a pointer to the memory block. If you want to count `std::string` as a container, that makes 3. — PaulMcKenzie, Feb 27 '23 at 01:53
Out of curiousity, I looked up the first recommended C++ textbook from Stackoverflow's list: the [C++ Primer](https://www.pearson.com/en-us/subject-catalog/p/c-primer/P200000000436/9780321714114). Well, what did I find in Chapter 9? "For example, `string` and `vector` hold their elements in contiguous memory". If one was focused on learning C++ from a good textbook, rather than from search engine queries, it's very likely they would've learned that, in short order. — Sam Varshavchik, Feb 27 '23 at 02:13
@SamVarshavchik I did know, and have known for a long time, that they are held in contiguous memory, I did **not** know that they could be passed to legacy APIs. For example, you can’t modify the c string returned by std::string.c_str(). But I am not ignorant to your advice, and have started revising my C++ from a textbook on that list. — avighnac, Feb 27 '23 at 07:17
The reason `c_str()`'s return value cannot be modified is because it's `const`, not because it comes from `std::string`. [`std::string::data`](https://en.cppreference.com/w/cpp/string/basic_string/data), on the other hand, just like `std::vector::data`, is mutable (unless it comes from a `const` string itself), and thus can be modified. — Sam Varshavchik, Feb 27 '23 at 12:03

chrysante · Answer 1 · 2023-02-26T18:43:15.727

Maybe you can use a parse tree design like this:

#include <cmath>
#include <string>

struct Expression {
    virtual ~Expression() = default;

    virtual double eval() const = 0; 
};

struct Variable: Expression {
    double eval() const override { 
        // Store a map of all your variable names somewhere and look it up
    }

    std::string name;
};

struct BinaryExpression: Expression {
    enum Operator { Plus, Minus, Times, Divide, Power };

    double eval() const override {
        switch (op) {
        case Plus: return lhs->eval() + rhs->eval();
        case Minus: return lhs->eval() - rhs->eval();
        // Other operators here...
        }
    }

    std::unique_ptr<Expression> lhs, rhs;
    Operator op;
};

struct FunctionCall: Expression {
    double eval() const override {
        if (functionName == "sin") { 
            return std::sin(argument->eval());
        }
        if (functionName == "cos") { /* ... */ }
        // and so on... 
    }

    std::string functionName; // E.g. "sin", "cos", etc.
    std::unique_ptr<Expression> argument;
    // ^^^ Make this a vector of unique_ptr's if you want to 
    //     have functions with more than one argument. 
};

Then e.g. your algexpr class would become a chain of BinaryExpression's with Operator::Plus. You can express exponentiation of a variable with the BinaryExpression class and Operator::Power. This way you can represent arbitrary expressions raised to the power of any other arbitrary expression.

This way you also don't really need the forward declarations because everything is implemented in terms of the Expression base class which is defined first.

This is perhaps a bit more general than you would need for only representing polynomials, but as you also want to support sin and cos functions and raising expressions to the power of other general expressions you will probably need a design like this.

Edit:

To clarify, then a + b + c would be represented like this:

                        BinaryExpression(Plus)
                          /              \
         BinaryExpression(Plus)       Variable("c")
           /          \
Variable("a")      Variable("b")

sin(a + b) like this:

         FunctionCall("sin")
                |
       BinaryExpression(Plus)  
           /          \
Variable("a")      Variable("b")

and x^(2n + 1) like this:

       BinaryExpression(Power)
          /              \
  Variable("x")       BinaryExpression(Plus)
                          /           \
          BinaryExpression(Times)   Constant(1)
             /              \
      Constant(2)       Variable("n")

The last example requires another class Constant to be added to the parse tree. Also it's probably a good idea to differentiate between variables and parameters, which I didn't do here.

Hello, firstly thank you for your answer! Could you please add code examples of how I'd use `BinaryExpression` for the examples you've provided? Also, is it viable to do it using classes, or do I have to change my approach to evade these problems? — avighnac, Feb 26 '23 at 18:55
Well, this is using classes. `struct` and `class` are the same thing in C++, if that's your question. And you construct the classes and thereby the parse tree while parsing the input. Then you use the `eval` function to evaluate the expression. Maybe have a look at this tutorial about parsing: https://www.youtube.com/watch?v=SToUyjAsaFk By the way, your code is already kind of similar to this, so I don't think you will have to change to much. Eg. `rfraction` would here be `BinaryExpression` with `op == Divide`. — chrysante, Feb 26 '23 at 19:04

How do I represent algebraic terms in C++ using classes: a term can be raised to a vector of terms

1 Answers1