
I am writing a program in C that imitates an LC-3 simulator. One of its objectives is to read a 4-digit hexadecimal value (0000 - ffff) from a file, convert it to binary, and interpret it as an LC-3 instruction. The following code segment shows how I store this value in a variable (which is where the problem seems to lie); below it is the output I am receiving:

int *strstr(int s, char c);
void initialize_memory(int argc, char *argv[], CPU *cpu) {
    FILE *datafile = get_datafile(argc, argv);


// Buffer to read next line of text into
#define DATA_BUFFER_LEN 256
    char buffer[DATA_BUFFER_LEN];
    int counter = 0;
    // Will read the next line (words_read = 1 if it started
    // with a memory value). Will set memory location loc to
    // value_read
    //
    int value_read, words_read, loc = 0, done = 0;
    char comment;
    char *read_success;    // NULL if reading in a line fails.
    int commentLine =0;
    read_success = fgets(buffer, DATA_BUFFER_LEN, datafile);


    while (read_success != NULL && !done) {
            // If the line of input begins with an integer, treat
            // it as the memory value to read in.  Ignore junk
            // after the number and ignore blank lines and lines
            // that don't begin with a number.
            //
            words_read = sscanf(buffer, "%04x%c", &value_read, &comment);


            // if an integer was actually read in, then
            // set memory value at current location to
            // value_read and increment location.  Exceptions: If
            // loc is out of range, complain and quit the loop. If
            // value_read is outside 0000 and ffff, then it's a
            // sentinel -- we should say so and quit the loop.
            if (value_read == NULL || comment ==';')
            {
                commentLine = 1;
            }
            if (value_read < -65536 || value_read > 65536)
            {
                printf("Sentinel read in place of Memory location %d: quitting loop\n", loc);
                break;
            }
            else if (value_read >= -65536 && value_read <= 65536)
            {
                if (commentLine == 0)
                {
                    if (counter == 0)
                    {
                        loc = value_read;
                        cpu -> memLocation = loc;
                        printf("\nPC location set to: x%04x \n\n", cpu -> memLocation);
                        counter++;
                    }
                    else
                    {
                        cpu -> mem[loc] = value_read;       
                        printf("x%04x : x%d\t %04x \t ", loc,loc, cpu -> mem[loc]); 
                        print_instr(cpu, cpu -> mem[loc]);
                        loc++;
                        value_read = NULL;
                    }
                }
            }

            if (loc > 65536)
            {
                printf("Reached Memory limit, quitting loop.\n", loc);
                break;
            }
            commentLine = 0;
            read_success = fgets(buffer, DATA_BUFFER_LEN, datafile);
            // Gets next line and continues the loop

    }
    fclose(datafile);
    // Initialize rest of memory

    while (loc < MEMLEN) {
            cpu -> mem[loc++] = 0;
    }
}

My aim is to show the Hex address : decimal address, the hex instruction, binary code, and then at the end, its LC-3 instruction translation. The data I am scanning from the file is the hex instruction:

x1000 : x4096    200c    0010000000001100       LD, R0, 12
x1001 : x4097    1221    0001001000100000       ADD, R1, R0, 0
x1002 : x4098    1401    0001010000000000       ADD, R2, R0, R0
x1003 : x4099    ffff94bf        0000000000000000       NOP
x1004 : x4100    166f    0001011001101110       ADD, R3, R1, 14
x1005 : x4101    1830    0001100000110000       ADD, R4, R0, -16
x1006 : x4102    1b04    0001101100000100       ADD, R5, R4, R4
x1007 : x4103    5d05    0101110100000100       AND, R6, R4, R4
x1008 : x4104    5e3f    0101111000111110       AND, R7, R0, -2
x1009 : x4105    5030    0101000000110000       AND, R0, R0, -16
x100a : x4106    52ef    0101001011101110       AND, R1, R3, 14
x100b : x4107    5fe0    0101111111100000       AND, R7, R7, 0
x100c : x4108    fffff025        0000000000000000       NOP
x100d : x4109    7fff    0111111111111110       STR, R7, R7, -2

As you can see, my problem lies at addresses x1003 and x100c.

As stated in the headline, the problem appears when the first hex digit of the stored instruction is between 8 and f. My best guess is that the scan interprets the value as negative because the most significant bit of that digit is set. If that is the case, it makes perfect sense, but is there a way I can bypass this? And if it isn't the case, what else could be causing this?

I found that if I pass value_read into print_instr() instead of cpu -> mem[loc], the output is correct. However, this is only a temporary fix, as I need to store that value for later use in the program (for actual execution of the instruction). So the problem seems to arise while storing, and I am unsure why.

Additionally (this is a side question and not a pressing concern): since I am using %x%c (value_read, comment) to read values from the file, I have been having trouble with the first few lines of the .hex file I am using, which contain no hex value, just a comment symbol (for those unfamiliar with LC-3 simulators, ';' is the symbol for comments). Whenever this is the case, I get a hex value of zero, although I wish for it to be NULL (in my program I implemented a temporary workaround because I am not sure how to fix it). I am not an expert in C just yet and have not been able to find a solution to this problem. If you can help, it would be greatly appreciated; otherwise, it isn't a big issue for what I am trying to achieve with this program. It is more for my own knowledge and growth.

Thank you all in advance for your help :)

2 Answers


In a scanf family format string, the %x specifier means to read into an unsigned int. The corresponding argument must have exactly the type unsigned int *.

However you supply an argument of type int *.

This causes undefined behaviour. What you are seeing is the chance interaction between library elements that expect you to follow the rules, and your code that didn't follow the rules.

To fix it, follow the rules. For example, read into an unsigned int variable.

NB. 0 does nothing in the scanf format string; %04x is equivalent to %4x.
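
A minimal sketch of that fix, assuming each line of the file is parsed on its own. The helper name parse_hex_word is hypothetical and not part of the asker's program. It reads into an unsigned int and uses the return value of sscanf, rather than the contents of the variable, to decide whether a hex value was actually present. That also covers comment-only lines, where nothing is converted.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original program): parses up to four
   hex digits at the start of `line`.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
static int parse_hex_word(const char *line, unsigned int *out)
{
    unsigned int value;   /* %x requires an unsigned int * argument */

    /* sscanf returns the number of successful conversions: 1 if hex
       digits were read, 0 or EOF if the line held no hex value
       (for example a line starting with ';' or an empty line). */
    if (sscanf(line, "%4x", &value) != 1)
        return 0;

    *out = value;
    return 1;
}
```

In the reading loop, a line for which parse_hex_word returns 0 can simply be skipped, so no sentinel such as NULL is needed to mark "no value".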

M.M
  • I did this, and changed the restriction on my else, if else statements so that the program would show its output. However, I am still getting the same output. – Lucas Trestka Nov 06 '17 at 06:10
  • @Jean-BaptisteYunès there are a lot of problems in your code; an accurate answer would require you to post your exact code in the first place. Your question title did say you had a problem with storing, not printing – M.M Nov 06 '17 at 07:22

May I suppose that cpu->mem is an array of short or something similar? If so, sign extension occurs when printing cpu->mem[loc]. Remember that arguments narrower than int are promoted to int in printf calls. The symptom is the same as in the following code:

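unsigned int i;          // %x expects an unsigned int *
scanf("%4x", &i);        // e.g. input: ffff
printf("%x\n", i);       // prints ffff
short s = i;             // ffff does not fit in a signed 16-bit short
printf("--> %x\n", s);   // s is promoted to int and sign-extended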

With the input ffff, the short typically ends up equal to -1. When it is promoted to int it is still -1, which prints as 0xffffffff (if int is 32 bits).

Use unsigned short instead.
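
The difference can be sketched as follows, assuming a 16-bit short and a two's complement platform (both typical, but not guaranteed by the C standard). The two helper names are hypothetical. Each stores the same word in a memory cell of a different type and returns what that cell becomes when widened to int, which is what happens when it is passed to printf.

```c
#include <stdio.h>

/* Hypothetical helper: store the word in a signed 16-bit cell, then widen.
   Converting a value above 0x7fff to short is implementation-defined;
   on typical platforms it wraps to a negative number. */
static int widen_signed(unsigned int word)
{
    short cell = (short)word;
    return cell;              /* sign-extended: 0x94bf becomes 0xffff94bf */
}

/* Hypothetical helper: store the word in an unsigned 16-bit cell, then widen. */
static int widen_unsigned(unsigned int word)
{
    unsigned short cell = (unsigned short)word;
    return cell;              /* zero-extended: 0x94bf stays 0x94bf */
}
```

Under those assumptions, printing widen_signed(0x94bf) with %x would reproduce the ffff94bf seen at x1003 in the question, while widen_unsigned(0x94bf) keeps the original four digits.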

Jean-Baptiste Yunès