I am building a disassembler for RISC-V binaries using the Capstone engine. After checking the input file (architecture, bitness, whether it has any program headers, ...), I have this for loop that iterates over all program headers, looking for the ones that contain executable code.

void checkElf(const char *elfFile)
{
    // Here would be the mentioned checks
    uint8_t i;
    for (i = 0; i < header.e_phnum; i++) {
        uint32_t offset = header.e_phoff + header.e_phentsize * i;
        fseek(file, offset, SEEK_SET);
        fread(&program_header, sizeof(program_header), 1, file);
        if (((PF_X | PF_R) == program_header.p_flags)) {
            dumpCode(file, &program_header, &header);
        }
    }
}

If any program header is marked as executable, then I call the following function:

static void dumpCode(FILE *file, Elf32_Phdr *segm, Elf32_Ehdr *header)
{
    int32_t *opcode;
    uint32_t offset, vaddr, i;
    char *mappedFile;
    struct stat statbuf;
    int fd;

    fd = fileno(file);
    fstat(fd, &statbuf);

    mappedFile = (char *) mmap(0, statbuf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);

    offset = segm->p_offset;
    opcode = (int *) (mappedFile + offset);
    vaddr = segm->p_vaddr;
    i = 0;

    if (0 == offset) {
        vaddr = header->e_entry;
        i = (header->e_entry - segm->p_vaddr) / 4;
        opcode += i;
    }

    for (; i < segm->p_filesz / 4; i++, vaddr += 4) {
        // do stuff...
    }
}

In that function, if the current program header (ph) starts at offset 0 (that is, it contains the ELF header), I move the virtual address and the opcode pointer to the entry point; otherwise, I start disassembling directly from the beginning of the segment.

My question is: should I care about where the ph containing the executable code is placed? Or, better said, could the ph that contains the executable code be placed somewhere else?

Jabberwocky
Josep
  • Your question is _exceedingly_ unclear (and your code is broken in a multitude of ways). – Employed Russian Aug 29 '22 at 17:32
  • What does a debugger session reveal? You could at least insert some `printf()` to see what you got. – the busybee Aug 29 '22 at 19:05
  • @thebusybee `printf` where? What I always get is that the value of `offset` is 0, which means that the ph contains the ELF header. It has been impossible for me to create a binary where the code is not placed in the ph that contains the ELF header. – Josep Aug 29 '22 at 19:22
  • @EmployedRussian Could you provide more info, so that I can clarify the question? How is my code broken? – Josep Aug 29 '22 at 19:24

1 Answer

I think this answer answers the question you are actually asking.

Your code assumes that an executable PT_LOAD segment contains executable code and nothing else, but that is generally not the case. As the two-segment example in the cited answer shows, a typical executable layout may have all of these sections in that segment: .interp .note.ABI-tag .dynsym .dynstr .gnu.hash .hash .gnu.version .gnu.version_r .rela.dyn .init .text .fini .rodata .eh_frame .eh_frame_hdr. As a result, you'll disassemble a whole lot of garbage.

There is also absolutely no guarantee that only .text follows e_entry, so skipping the beginning of the segment up to e_entry doesn't solve anything.

Employed Russian
  • The reason I'm going directly to `e_entry` is that I'm only interested in the `.text` section. The behaviour I'm expecting is the same as when I use objdump and only want `.text`, not `.interp .init .fini`... – Josep Aug 30 '22 at 09:00
  • @Josep If that's the behavior you want, then you should use _sections_, not segments, to achieve it. Using segments doesn't achieve the desired result (and can't be made to achieve it). – Employed Russian Aug 30 '22 at 23:14
  • OK, the reason I was looking at the ph is that the idea behind my tool is to extract the executable instructions of the binary. I mean, if the binary has no ph, the system cannot create a process, so it won't be executed. If I look at the sections instead, I should process only sections of type `SHT_PROGBITS`, right? – Josep Aug 31 '22 at 16:00
  • @Josep Yes, looking at just the `SHT_PROGBITS` sections is correct. Note that if the sections are removed, your tool won't work. But there is no way to know which bytes are instructions and which are "other" in that case. – Employed Russian Aug 31 '22 at 16:31
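
The section-based approach discussed in these comments could be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's or the answerer's code: the names `code_range` and `find_code_sections` are hypothetical, the file is assumed to be a 32-bit ELF in host byte order that has already passed the header checks, and the whole file is assumed to be in memory (for example via `mmap`, as in `dumpCode`). Because `.rodata` and similar data sections are also `SHT_PROGBITS`, the sketch additionally tests the `SHF_EXECINSTR` flag.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: one executable section found in the image. */
typedef struct {
    uint32_t offset; /* file offset of the section contents */
    uint32_t size;   /* size of the section in bytes */
    uint32_t addr;   /* virtual address of the first byte */
} code_range;

/*
 * Hypothetical helper: scan the section header table of an in-memory
 * ELF32 image and record every SHT_PROGBITS section that is flagged
 * SHF_EXECINSTR. Returns the number of ranges written to `out`.
 * Assumption: the ELF header was validated by the caller.
 */
size_t find_code_sections(const uint8_t *image, size_t len,
                          code_range *out, size_t max)
{
    Elf32_Ehdr eh;
    size_t found = 0;

    assert(image != NULL && out != NULL);
    if (len < sizeof eh)
        return 0;
    memcpy(&eh, image, sizeof eh); /* memcpy avoids unaligned access */

    /* No section header table (e.g. stripped): nothing to report. */
    if (eh.e_shoff == 0 || eh.e_shentsize < sizeof(Elf32_Shdr))
        return 0;

    for (uint16_t i = 0; i < eh.e_shnum && found < max; i++) {
        Elf32_Shdr sh;
        size_t pos = (size_t)eh.e_shoff + (size_t)eh.e_shentsize * i;

        if (pos > len || len - pos < sizeof sh)
            break; /* header table runs past the end of the file */
        memcpy(&sh, image + pos, sizeof sh);

        if (sh.sh_type != SHT_PROGBITS)
            continue; /* not program-defined contents */
        if (!(sh.sh_flags & SHF_EXECINSTR))
            continue; /* data such as .rodata */
        if (sh.sh_offset > len || sh.sh_size > len - sh.sh_offset)
            continue; /* contents would fall outside the file */

        out[found].offset = sh.sh_offset;
        out[found].size = sh.sh_size;
        out[found].addr = sh.sh_addr;
        found++;
    }
    return found;
}
```

Each returned range could then be handed to the disassembly loop with `addr` as the starting virtual address. Note that this filter would still report `.init`, `.fini` and `.plt` alongside `.text`, since they are executable too; restricting the output to `.text` alone would require comparing section names through the string table referenced by `e_shstrndx`.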