Tokenize negative numbers. Convert an infix expression to postfix with negative values using RPN

Question

So I'm trying to make a working calculator using RPN (reverse polish notation), however, it works perfectly fine using positive numbers and doing a simple substract like (3*4-2+5)/3*2 which postfix expression is 34*2-5+3/2* and 10 as result. The thing is I'm not sure how to implement negative numbers, such as (-4+2*2), in this case the postfix expression should be ??? (4-22*) ??? (I'm not sure about this one) but the result should be 0

I've already made a tokenizer which prints the type of each token and its value. I.e:
Input: (2*3+4-5)
Output:

(Infix expression)   | (Postfix Expression)
Parenthesis [(]        Number[2]
Number [2]             Number[3]
Operator [*]           Operator[*]
Number [3]             Number[4]
Operator [+]           Operator[+]
Number[4]              Number[5]
Operator [-]           Operator[-]
Number [5] 
Parenthesis [)]

                       Result: 5.0

However, when I try to use a negative number expression like the one above (-4+2*2) I get an error "Stack Underflow" because when I try to convert it I get:

(Infix Expression)   | (Postfix Expression)
Parenthesis [(]        Number[4]
Operator [-]           Operator[-]
Number [4]             Number[2]
Operator [+]           Number[2]
Number [2]             Operator[*]
Operator [*]           Operator[+]
Number [2]
Parenthesis [)]         ^^^^ Therefore, when I use my InfixToPostfix function, 
                             It cannot find a number before the first [4] to do the [-]
                             operation, hence, in the "pop" function it prints the "Stack Underflow" error.

How can I fix this? I'm trying to fix it in the Tokenizer function which is the follow one:

Token.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#include "token.h"


/* Works using enum within the "token.h" and the TToken struct, defined as follows: 

extern char strToken[][11];
typedef enum {variable, parentesis,operador, numero} tipoToken;
typedef struct{
    tipoToken tipo; //Type of the token
    char token[20]; //Content of that token
} TToken;

/*

char strToken[][11] = {"Variable", "Parentesis","Operador", "Numero"}; //Variable, Parenthesis, Operator, Number

int esOperador(char c) //isOperator function, which evaluates the valid nonalpha&&nondigit characters
{
    int k;
    static char operadores[] = "+-*/^";
    int l = strlen(operadores);
    for (k=0; k<l; k++)
        if (c == operadores[k])
            return 1;
    return 0;
}


int esParentesis(char c) // IsParenthesis function
{
if (c=='(' || c==')')
    return 1;
return 0;
}

TToken *token(char *s) 
{
    static char cadena[1000]; // The aux variable that will store 's' (original string)
    static char *actual;  // Keeps a track where the string is
    int k;
    char res[1000];

    TToken *unToken=NULL;


    if (s != NULL) {
        strcpy(cadena,s);
        actual = cadena;
    }

    while ((*actual == ' ') || (*actual =='\n') || (*actual=='\t')) //Only reads the valid characters of the string
        actual++;

    if (*actual == '\0')
        return NULL;
    if ( (esOperador(*actual) == 1) ) { // If is an operator, then...
        res[0] = *actual;
        res[1] = '\0';
        unToken=malloc(sizeof(TToken));
        unToken->tipo=operador;
        strcpy(unToken->token,res);
        actual = actual + 1;
        return unToken;
    } else if ( (esParentesis(*actual) == 1) ) { //If is a parenthesis, then...
        res[0] = *actual;
        res[1] = '\0';
        unToken=malloc(sizeof(TToken));
        unToken->tipo=parentesis;
         strcpy(unToken->token,res);
        actual = actual + 1;
        return unToken;
    } else if (isdigit(*actual) || (*actual=='.')) { // Read float numbers
        k = 0;
        while ( (*actual != '\0') && (isdigit(*actual) || *actual == '.' ) ) {
            res[k] = *actual;
            actual = actual + 1;
            k = k+1;
        }
        res[k] = '\0';
        unToken=malloc(sizeof(TToken));
        unToken->tipo=numero;
         strcpy(unToken->token,res);
        return unToken;
    } else if (isalpha(*actual)) { //**I might use this one also to evaluate also variables (Idk how, yet)**
        k = 0;
        while ( (*actual != '\0') && isalpha(*actual)) {
            res[k] = *actual;
            actual = actual + 1;
            k = k+1;
        }
        res[k] = '\0';
        unToken=malloc(sizeof(TToken));
        unToken->tipo=variable;
        strcpy(unToken->token,res);
        return unToken;
    }

    printf("Error: Incorrect expression\n");
    exit(0);
}

After that call to the token function, I call the InfixToPostfix expression, but in my opinion, the "correct" tokenized expression should be sent to InfixToPostfix and therefore, I should change the "token" function to allow negative numbers (unary minus).
If you do have any question do not hesitate asking (Such as another used functions in my code)

Please post an [mcve]. The code you posted will not compile. — jwdonahue, Jun 05 '20 at 03:30
I think this is incorrect: " it works perfectly fine using positive numbers and doing a simple substract like (3*4-2+5)/3*2 which postfix expression is 34*2-5+3/2* and 10 as result.". Shouldn't it be (3*4 + -2 + 5)/3*2? It's been a long time for me, but aren't you always either summing, dividing, multiplying, etc, some value? — jwdonahue, Jun 05 '20 at 03:46
@jwdonahue: their postfix is correct in that case. In postfix, you always have Expr Expr Oper, unless the operator only takes one argument. — rici, Jun 05 '20 at 04:20
@jwdonahue: Postfix has no parentheses. (And no need for them.) I never had a calculator, myself. But I learned the Shunting Yard algorithm in the 60s. — rici, Jun 05 '20 at 05:31
@chriss: There's full pseudocode for a Shunting Yard algorithm in [this answer](https://stackoverflow.com/a/16392115/1566221). If that's the algorithm you're using, it might help you. It shows how to build a simple two-state automaton which correctly figures out if `-` is prefix or infix. A calculator doesn't need to convert to postfix; it can just do the calculation. (The rest of the algorithm is unchanged.) But if for whatever reason you insist on converting to postfix, make sure you output different symbols for the two different `-` operators. Otherwise, the postfix notation is ambiguous. — rici, Jun 05 '20 at 05:40
@rici I apparently can't even comprehend postfix anymore. I hate getting old! — jwdonahue, Jun 05 '20 at 05:55
@jwdonahue: Yeah, it sucks. I don't care what anyone says. By the way, 12 - 2 + 5 is `12 2 - 5 +`. RPN is evaluated as a stack machine: numbers get pushed, and operators pop the top 2 values (or 1, if they are unary operators) and push the result. If the expression is syntactically correct, there's one value on the stack at the end, which is the result. — rici, Jun 05 '20 at 06:07
Here is a link to an online compiler (which works fine, so far): https://onlinegdb.com/HJ9n1IO3U — Izanami, Jun 05 '20 at 23:06
@rici I read your post about the shunting yard algorithm, and I sadly gotta say that I do not fully understand how to implement the unary minus in my code based in your comment. — Izanami, Jun 05 '20 at 23:08
@rici Well, I'm not sure if I should change my token.c function to create the tokens with negative numbers in each token (if necessary), or just work with the infix expression as in your comment. In the second case, I use the shunting yard algorithm in my code (look at the onlinegdb link above) in "posfija.c"; in "posfija" (postfix) function within the said .c file I've already made an algorithm to read and convert from infix notation to postfix notation (as I must do it that way), so what change should I do in that function to make it work with unary minus? — Izanami, Jun 06 '20 at 02:43
@chriss: you should treat `-` as an operator, not part of the number. (You need to be able to parse expressions like `-(2-7)/5`; unary minus is not just for numbers.) The point of the algorithm is to let you know *which* operator it is. If you're planning on doing anything other than immediate evaluation, you need to code the two operators differently in order to avoid ambiguity. — rici, Jun 06 '20 at 02:48
I'm trying to do that, adding "!" as the unary minus, so if I have `-1+2` the output would be [!][1][+][2] (infix). But then how can I handle the ! when it doesn't start with '-' and starts, for example with "(-1+2)" or some expressions like "1+2*(-3+4)-5"? I'm lost right now — Izanami, Jun 06 '20 at 04:50
@chriss: if you try to do that with your tokeniser, you end up reproducing the parsing logic in the tokeniser, which should be a clue that it is not the best solution. The pseudo code in my answer shows how to do it in the parser. — rici, Jun 06 '20 at 14:51

Tokenize negative numbers. Convert an infix expression to postfix with negative values using RPN

0 Answers0