1

I wasn't sure how to explain the problem I am having, so apologies if the title of the question is vague.

I have a list of virtual addresses that are stored in variables. For example, I have

0x8c334dd

stored in a char variable. This is the address of another variable that holds data. What I want to do is go to that address and read the data stored there.

My assumption was that dereferencing the pointer would be the best way to go. Unfortunately, I don't know the type of the variable the address points to, so how does dereferencing work in this case? I cannot simply do *(char *) 0x8c334dd, because I don't know the type of the variable at that address.

If I cast it to (int *), I get the data for some of the variables (remember that I have several addresses), but for others I just get another address, and I need the data (these variables are structs, chars, etc.).

I am working with the ELF symbol table.

Seki
attis
  • Assuming you have read access to the address: `char a = *(char *) 0x8c334dd;` – ouah Sep 17 '12 at 20:20
  • My assumption was that dereferencing the pointer wouldn't work here, because if I dereference a variable that holds the address, I would get its content (which is the same address) and not the content of the second variable, right? Also, I don't know the type of the variable that I want to get the data from. – attis Sep 17 '12 at 20:22
  • Huh? What "second variable"? If you dereference a pointer, you get whatever is stored at that address. (Also, if this is a C question, why is it tagged C++?) – David Schwartz Sep 17 '12 at 20:23
  • Hmm, so suppose I have var1, which is `int data = 34;`, and then another variable, var2 = "0x8f3343d3", which is an address pointing to that variable. My question is: if I do `*(var2)`, would I get the "0x8f3343d3" value or the 34 from the data variable? More specifically, would var2 work the same as a pointer, even though it is not declared as a pointer? – attis Sep 17 '12 at 20:25
  • @user1106581 You can only use `*` with pointers. But you can cast `var2` to `int *` before applying `*`. – ouah Sep 17 '12 at 20:29
  • @ouah ok that makes sense. But what if I don't know the type that I am expecting from the second variable? I did what you suggested and then printed out char a, but I am not getting any ASCII values... Maybe because not all of the variables are char? – attis Sep 17 '12 at 20:41
  • @David Schwartz I tagged the question as C++ because this can be easily a C++ question as well, actually the framework that I am working on is both C and C++ – attis Sep 17 '12 at 20:42
  • @user1106581 You should edit your question and explain what you are really trying to do. It looks like an [XY problem](http://mywiki.wooledge.org/XyProblem) – ouah Sep 17 '12 at 20:47
  • @ouah Just did, sorry about that. Let me know if it is clear enough.. – attis Sep 17 '12 at 20:58
  • I would suggest you rip off as much code from `gdb` as possible. It does all of this already. – David Schwartz Sep 17 '12 at 21:00
  • @DavidSchwartz Yes, I was thinking about that. So far I have checked both the DWARF and ELF C files, but gdb would be a really hard/large code base to look into. My assumption is that it must be something relatively easy to do, but I am just not getting any ideas right now. – attis Sep 17 '12 at 21:04
  • Your assumption is false. It's in fact arbitrarily hard. It's just a matter of how far you want to go. The data could, in principle, have any structure imaginable. – David Schwartz Sep 17 '12 at 21:08
  • @DavidSchwartz Hmm, have you worked on this before? Do you, by any chance, know which modules I should be looking at in the gdb source? – attis Sep 17 '12 at 21:12
  • @user1106581: you _need_ to know the type before getting at the contents of a variable, otherwise you can only guess, and quite possibly, fail at guessing right. – ninjalj Sep 17 '12 at 21:21
  • do you want to use a library without using the header file? – moooeeeep Sep 18 '12 at 19:07

2 Answers

3

In general, neither C nor C++ has any way of knowing what type of data a raw address points to.

The usual way to solve this problem is to make the pointer point to a struct, and have a known position in the struct indicate the type of the data. Usually the known position is the first position in the struct.

Example:

#include <stdint.h>  // for uint32_t

// signature value; use any value unlikely to happen by chance
#define VAR_SIG 0x11223344

typedef enum
{
    vartypeInvalid = 0,
    vartypeInt,
    vartypeFloat,
    vartypeDouble,
    vartypeString,
    vartypeMax  // not a valid vartype
} VARTYPE;

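typedef struct
{
    VARTYPE type;   // says which member of the union is valid
#ifdef DEBUG
    uint32_t sig;   // set to VAR_SIG by the init code
#endif // DEBUG
    union           // declared as a member named "data"
    {
        int i;
        float f;
        double d;
        char *s;
    } data;
} VAR;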

You can then do a sanity check: you can see if the type field has a value greater than vartypeInvalid and less than vartypeMax (and you will never need to edit those names in the sanity check code; if you add more types, you add them before vartypeMax in the list). Also, for a DEBUG build, you can check that the signature field sig contains some specific signature value. (This means that your init code to init a VAR instance needs to always set the sig field, of course.)
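The sanity check described above could be sketched roughly as follows. This is only an illustration: the helper names `var_init_int`, `var_is_valid`, and `var_get_int` are hypothetical, and the sketch assumes the union is declared as a named member called `data`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VAR_SIG 0x11223344

typedef enum {
    vartypeInvalid = 0,
    vartypeInt,
    vartypeFloat,
    vartypeDouble,
    vartypeString,
    vartypeMax  /* not a valid vartype */
} VARTYPE;

typedef struct {
    VARTYPE type;
#ifdef DEBUG
    uint32_t sig;
#endif
    union { int i; float f; double d; char *s; } data;
} VAR;

/* Hypothetical init helper: always sets the signature in DEBUG builds. */
static void var_init_int(VAR *v, int value)
{
#ifdef DEBUG
    v->sig = VAR_SIG;
#endif
    v->type = vartypeInt;
    v->data.i = value;
}

/* Hypothetical sanity check: type must lie strictly between the two
   sentinel values, and in DEBUG builds the signature must match. */
static bool var_is_valid(const VAR *v)
{
    if (v == NULL)
        return false;
    if (v->type <= vartypeInvalid || v->type >= vartypeMax)
        return false;
#ifdef DEBUG
    if (v->sig != VAR_SIG)
        return false;
#endif
    return true;
}

/* Hypothetical accessor: refuses to read the wrong union member. */
static int var_get_int(const VAR *v)
{
    assert(var_is_valid(v) && v->type == vartypeInt);
    return v->data.i;
}
```

Because the check compares only against `vartypeInvalid` and `vartypeMax`, adding a new type before `vartypeMax` requires no change to `var_is_valid`.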

If you do something like this, then how do you initialize it? Runtime code will always work:

VAR v;

#ifdef DEBUG
v.sig = VAR_SIG;
#endif // DEBUG
v.type = vartypeFloat;
v.data.f = 3.14f;  // assign to the union member that matches v.type

What if you want to initialize it at compile time? It's easy if you want to initialize it with an integer value, because the int type is the first type in the union:

VAR v =
{
    vartypeInt,
#ifdef DEBUG
    VAR_SIG,
#endif // DEBUG
    1234
};

If you are using a C99-compliant compiler, you can initialize the struct with designated initializers (naming the field) and so assign to any member of the union. But Microsoft C isn't C99 compliant (at least at the time of writing), so the above is a nightmare if you want to init your struct with a float or double value. (If you cast the float value to an integer, C won't just reinterpret the bits, it will convert the value and discard the fractional part; and there is no trick I know of to portably get a 32-bit integer value that correctly represents a 32-bit float at compile time in a C program.)
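For compilers that do support C99, the designated-initializer form might look like the sketch below. The names `pi_var` and `var_as_float` are hypothetical, and the union is assumed to be a named member called `data`.

```c
#include <assert.h>
#include <stdint.h>

#define VAR_SIG 0x11223344

typedef enum {
    vartypeInvalid = 0,
    vartypeInt,
    vartypeFloat,
    vartypeDouble,
    vartypeString,
    vartypeMax  /* not a valid vartype */
} VARTYPE;

typedef struct {
    VARTYPE type;
#ifdef DEBUG
    uint32_t sig;
#endif
    union { int i; float f; double d; char *s; } data;
} VAR;

/* C99 designated initializers: the float member is set at compile time
   even though it is not the first member of the union. */
static const VAR pi_var = {
    .type = vartypeFloat,
#ifdef DEBUG
    .sig = VAR_SIG,
#endif
    .data = { .f = 3.14f }
};

/* Hypothetical accessor that checks the tag before reading the union. */
static float var_as_float(const VAR *v)
{
    assert(v->type == vartypeFloat);
    return v->data.f;
}
```

Because every field is named, the initializer does not depend on the order of the members or on whether the `sig` field is compiled in.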

See also: Compile time float packing/punning

If you are working with pointers, though, that's easy. Just make the first field name in the union be a pointer type, cast the pointer to void * and init the struct as above (the pointer would go where 1234 went above).
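A minimal sketch of the pointer-first layout, under the assumption that only pointer payloads are needed. The type `PVAR`, its enum values, and the names `greeting_text`, `greeting`, and `pvar_as_string` are all hypothetical; the DEBUG signature field is left out for brevity.

```c
#include <assert.h>

typedef enum {
    ptrtypeInvalid = 0,
    ptrtypeString,
    ptrtypeIntArray,
    ptrtypeMax  /* not a valid type */
} PTRTYPE;

typedef struct {
    PTRTYPE type;
    union {
        void *p;   /* first member: the one a plain initializer sets */
        char *s;
        int  *ia;
    } data;
} PVAR;

static char greeting_text[] = "hello";

/* Positional (C89-style) initializer: the pointer is cast to void *
   and lands in the first union member. */
static PVAR greeting = { ptrtypeString, { (void *)greeting_text } };

/* Hypothetical accessor: the tag says which pointer type to read. */
static const char *pvar_as_string(const PVAR *v)
{
    assert(v->type == ptrtypeString);
    return v->data.s;
}
```

Reading `data.s` after initializing `data.p` relies on `void *` and `char *` sharing the same representation, which the C standard guarantees.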

If you are reading tables written by someone else's code, and you don't have a way to add a type identifier field, I don't have a general answer for you. I guess you could try reading the pointer out as different types, and see which one(s) work?
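To make the "try different types" idea concrete, here is a rough sketch that reads the same bytes under several interpretations. The helper names are hypothetical, and the caller is assumed to know that the address is readable and that enough bytes are available. Nothing in the bytes themselves identifies the real type, so this is at best a heuristic.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy sizeof(int) bytes from addr and interpret them as an int.
   memcpy avoids alignment and aliasing problems of a direct cast. */
static int read_as_int(const void *addr)
{
    int value;
    assert(addr != NULL);
    memcpy(&value, addr, sizeof value);
    return value;
}

/* Same bytes, interpreted as a float instead. */
static float read_as_float(const void *addr)
{
    float value;
    assert(addr != NULL);
    memcpy(&value, addr, sizeof value);
    return value;
}

/* Heuristic only: true if the first len bytes are all printable. */
static bool looks_like_text(const void *addr, size_t len)
{
    const unsigned char *bytes = addr;
    size_t i;
    assert(addr != NULL);
    for (i = 0; i < len; i++) {
        if (!isprint(bytes[i]))
            return false;
    }
    return len > 0;
}
```

Every interpretation "succeeds" in the sense that it returns some value, which is why guessing is unreliable without separate type information.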

Community
steveha
  • This is helpful, and it gave me a couple of good ideas. Unfortunately, the table values are not produced by me. I am working with the ELF symbol table values, as a way to "decompile" any program passed to my program. Given the ELF symbol table, I need to be able to deduce which variables are of int type, char type or struct type and then grab the value for each one of them. What I have so far is an address that points "somewhere" with an unknown type, which is just the set of addresses that we can get from the ELF symbol table. – attis Sep 17 '12 at 21:44
  • I am indeed working with someone else's tables, an API offered by ELF to be more precise. And the thing is that trying to guess the type of the pointer will be futile because the compiler doesn't complain about the type being invalid, so if I am trying to read the pointer as char then I might get a random non-ASCII char value, if I try reading it as int I might get the actual value or just another address pointing to the value.. – attis Sep 17 '12 at 21:49
0

Just wanted to add something for people out there working with the ELF symbol table: I've found the DIEs (debugging information entries) in the DWARF data easier to work with. You can get the addresses, types and names of variables using DWARF instead of the ELF symbol table, and libdwarf has good documentation.

attis